The present invention is directed to materials and methods for the quasi-random and complete fragmentation of DNA using restriction
endonuclease reagents capable of cutting DNA at a dinucleotide sequence. The invention is also directed to methods for labeling DNA, for
shotgun cloning, for sequencing of DNA, for epitope mapping and for anonymous primer cloning, all using fragments of DNA generated
by the method of the present invention. In addition, the present invention is directed to DNA sequences encoding a novel restriction
endonuclease (designated R.CviJI) and variants thereof as well as to methods and materials for production of the same by recombinant
methods. A bacterial host cell transformed with DNA encoding R.CviJI is also disclosed as well as methods for expressing R.CviJI in
the bacterial host system and subsequent materials and methods for purifying the enzyme.
DINUCLEOTIDE RESTRICTION ENDONUCLEASE PREPARATIONS AND METHODS OF
USE. 

FIELD OF THE INVENTION

The present invention relates generally to isolated purified
polynucleotides which encode restriction enzymes and to methods of expressing
the restriction enzymes from such polynucleotides. More particularly, this
invention relates to isolated purified polynucleotides which encode CviJI and
related methods for the production of this enzyme.

Other aspects of the invention relate to methods for partially or
completely digesting DNA at a dinucleotide sequence. More particularly, this
aspect of the invention relates to methods of generating quasi-random fragments
of DNA, and methods of cloning, labeling, and sequencing DNA, as well as
epitope mapping of proteins. The invention also relates to methods for generating
sequence-specific oligonucleotides from DNA, without prior knowledge of the
nucleic acid sequence of such DNA, and to methods for cloning and labeling
DNA after restriction digestion by a two base recognition endonuclease reagent.
This invention also relates to methods for cloning, labeling, and detecting nucleic
acids using two base restriction endonuclease reagents, such as CviJ I, BsuR I,
Aci I or CGase I. Further, the invention relates to labeling DNA by taking
advantage of certain properties of the holo-enzyme of thermostable DNA
polymerases.

BACKGROUND OF THE INVENTION
Restriction endonucleases are a group of enzymes originally found
to be expressed in a wide variety of prokaryotic organisms. More recently they
have also been found to be encoded in viral genomes. These enzymes catalyze
the selective cleavage of DNA at generally short sequences, often unique to the
individual enzyme. This ability to cleave makes restriction endonucleases
indispensable tools in recombinant DNA technology. The increased commercial
availability of the isolated enzymes has contributed in large part to the enormous
expansion in the field of recombinant DNA technology over the last few years.

These enzymes have been classified into three groups. Because of
properties of the type I and type III enzymes, they have not been widely used in
molecular biology applications, and will not be discussed further. Type II
enzymes are part of a binary system known as a restriction modification system
consisting of a restriction endonuclease that cleaves a specific sequence of
nucleotides and a separate DNA modifying enzyme that modifies the same
recognition sequence and thereby prevents cleavage by the cognate endonuclease.
A total of about 2103 restriction enzymes are known, encompassing 179 different
type II specificities (Roberts, et al., Nucl. Acids Res. 20:2167-2180 (1992)).
Although there are more than 1200 type II restriction enzymes, many of them are
members of groups which recognize the same sequence. Restriction enzymes that
recognize the same sequence are said to be isoschizomers.
The vast majority of type II restriction enzymes recognize specific
double-stranded sequences which are four, five, or six nucleotides in length and
which display twofold (palindromic) symmetry. A few enzymes recognize longer
sequences or degenerate sequences.
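
The twofold (palindromic) symmetry described above can be checked mechanically: a site is palindromic in this sense when it reads the same as its own reverse complement. The following is only a minimal sketch of that check; the function names are illustrative, and the sequence GGCA is a hypothetical non-palindromic example that does not come from the text.

```python
# Watson-Crick complements for the four DNA bases.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindromic(site: str) -> bool:
    """True when the site equals its own reverse complement (twofold symmetry)."""
    return site.upper() == reverse_complement(site)


if __name__ == "__main__":
    # GGCC and CCGG are recognition sequences named in this document;
    # GGCA is a made-up sequence used only as a negative example.
    for site in ("GGCC", "CCGG", "GGCA"):
        print(site, is_palindromic(site))
```

Degenerate sites such as PuGCPy would need a complement table extended to ambiguity codes, which this sketch deliberately leaves out.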

The location of cleavage sites within a palindrome differs from
enzyme to enzyme. Some enzymes cleave both strands exactly at the axis of
symmetry generating fragments of DNA that carry blunt ends, while others cleave
each strand at similar sequences on opposite sides of the axis of symmetry,
creating fragments of DNA that carry protruding, single-stranded termini.

Restriction endonucleases with shorter recognition sequences cut
DNA more frequently than those with longer recognition sequences. For
example, assuming a 50% G-C content, a restriction endonuclease with a 4-base
recognition sequence will cleave, on average, every 4^4 (256) bases compared to
every 4^6 (4096) bases for a restriction endonuclease with a 6-base recognition
sequence. Under certain conditions some restriction endonucleases are capable
of cleaving sequences which are similar but not identical to their defined
recognition sequence. This altered specificity has been termed "star" (*) activity
and is observed only under certain non-standard reaction conditions. The manner
in which an enzyme's specificity is altered depends on the particular enzyme and
on the conditions employed to induce the star activity. Conditions that contribute
to star activity include high glycerol concentration, high ratio of enzyme to DNA,
low ionic strength, high pH, the presence of organic solvents, and the substitution
of Mg2+ with other divalent cations. The most common types of star activity
involve cutting at a recognition sequence having a single base substitution, cutting
at sites having truncation of the outer bases of the recognition sequence, and
single-strand nicking. The following restriction endonucleases show star activity:
Ase I, BamH I, BssH II, BsuR I, CviJ I, EcoR I, EcoR V, Hind III, Hinf I, Kpn
I, Pst I, Pvu II, Sal I, Sca I, Taq I, and Xmn I. Star activity is generally viewed
as undesirable, and of little intrinsic value.
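
The cutting-frequency figures given earlier in this paragraph (256 bases for a 4-base site, 4096 bases for a 6-base site) follow from treating each position of random DNA as an independent draw. The sketch below reproduces that arithmetic under that assumption; the function names and the `gc_content` parameter are illustrative, and GAATTC is used only as an example of a fixed six-base sequence.

```python
def site_probability(site: str, gc_content: float = 0.5) -> float:
    """Chance that a given position in random DNA starts this exact site.

    Assumes independent bases with P(G) = P(C) = gc_content / 2 and
    P(A) = P(T) = (1 - gc_content) / 2.
    """
    p_gc = gc_content / 2.0
    p_at = (1.0 - gc_content) / 2.0
    probability = 1.0
    for base in site.upper():
        if base in "GC":
            probability *= p_gc
        elif base in "AT":
            probability *= p_at
        else:
            raise ValueError(f"unsupported base: {base!r}")
    return probability


def expected_spacing(site: str, gc_content: float = 0.5) -> float:
    """Average number of bases between occurrences of the site."""
    return 1.0 / site_probability(site, gc_content)


if __name__ == "__main__":
    print(expected_spacing("GGCC"))    # 4-base site at 50% G-C
    print(expected_spacing("GAATTC"))  # 6-base site at 50% G-C
```

At 50% G-C content every base has probability 1/4, so the result depends only on site length; at other compositions G-C-rich and A-T-rich sites diverge.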
Of the 179 unique type II restriction endonucleases, 31 have a 4-base
recognition sequence, 11 have a 5-base recognition sequence, 127 have a 6-base
recognition sequence, and 10 have recognition sequences of greater
than 6 bases. In two cases, a restriction endonuclease has a recognition sequence
of less than 4 bases.

The restriction enzyme CviJ I has a three base recognition sequence
or a two-base recognition sequence, depending on the reaction conditions. Under
normal reaction conditions CviJ I recognizes the sequence PuGCPy (wherein
Pu=purine and Py=pyrimidine) and cleaves between the G and C to leave blunt
ends (Xia et al., 1987. Nucleic Acids Res. 15:6075-6090). Under "relaxed" or
"star" conditions (in the presence of 1 mM ATP and 20 mM DTT) the specificity
of CviJ I may be altered to cleave DNA more frequently. This activity is referred
to as CviJ I*, for star or altered specificity. However, CviJ I* activity is not
observed under conditions which favor star activity of other restriction
endonucleases.
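
To make the normal-condition specificity concrete, the sketch below locates PuGCPy sites in a sequence and splits the sequence between the G and C of each site, giving blunt-ended fragments of the top strand. It is an illustration only: the function names and the test sequence are invented for this example, and relaxed (CviJ I*) cutting is not modeled.

```python
import re

# PuGCPy: purine (A or G), then G, C, then pyrimidine (C or T).
# The lookahead keeps the scan position-by-position so no site is skipped.
PUGCPY = re.compile(r"(?=[AG]GC[CT])")


def cut_positions(seq: str) -> list[int]:
    """Return 0-based indices of the C that follows each cut.

    A site starting at index i is cut between its G (i + 1) and C (i + 2).
    """
    return [match.start() + 2 for match in PUGCPY.finditer(seq.upper())]


def digest(seq: str) -> list[str]:
    """Split seq at every PuGCPy cut position (a complete digest)."""
    bounds = [0] + cut_positions(seq) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    example = "TTAGCTAAGGCCTTCGCA"  # hypothetical sequence
    print(cut_positions(example))
    print(digest(example))
```

In the example sequence the trailing CGCA contains a GC dinucleotide but is not cut, because its flanking bases do not fit the Pu...Py pattern.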
The restriction enzyme BsuR I normally recognizes the sequence
GGCC and cleaves between the G and C to leave blunt ends (Heininger, et al.,
Gene 1:291-303 (1977)). Under relaxed conditions (high pH, low ionic strength,
and high glycerol concentration) the specificity of BsuR I may be altered to cleave
DNA more frequently. An isoschizomer of this enzyme, Hae III, does not display
this star activity.

In bacteria, the restriction endonuclease provides a mechanism of 
defense against foreign DNA molecules (e.g., bacteriophage DNA) by virtue of 
its ability to distinguish and cleave only exogenous DNA, leaving endogenous 
bacterial DNA unaffected. Viral endonucleases possess the same discerning 
capabilities, but rather than providing a means for defense, this activity has 
presumably evolved to cripple the host's ability to replicate its own DNA and 
allows the virus to assume control of the host's replication machinery.

Bacteria and viruses which express restriction endonucleases
necessarily possess the inherent ability to protect their own genome from cleavage
by their endogenous endonuclease. The primary mechanism by which this is
accomplished is by modifying the organism's own DNA by, for example,
methylating a base in the recognition sequence which prevents binding and
cleavage by the endonuclease. Therefore, to ensure viability, the genome of an
organism which expresses a restriction endonuclease is almost always heavily
modified, usually by methylation of cytosine or adenosine bases. The methylase
enzyme which modifies the genome (itself a useful tool in molecular biology) acts
in tandem with the endonuclease, either as part of an enzyme complex
(restriction/modification complex) or as two distinct entities. Therefore,
recognizing that an organism expresses an enzyme with endonuclease activity
strongly suggests the expression of an associated modifying methylase enzyme
(and vice versa) and this association has led to isolation and cloning of a number
of commercially available restriction/modification enzymes for use in the
laboratory as discussed below.
One of the limitations in the use of restriction endonucleases exists
when cleavage of a given sequence is required and no known endonuclease exists
which is specific for that particular sequence. Therefore, the continued
identification and isolation of unique restriction endonucleases and altered reaction
conditions will allow for even more sophisticated manipulation of DNA in vitro.

A number of publications and patents describe the cloning of DNAs
encoding restriction endonucleases. Included among these publications is Kiss,
A., et al., Nucleic Acids Research 13:6403-6421 (1985), which describes the
cloned nucleotide sequence of the BsuRI restriction-modification system isolated
from Bacillus subtilis. This system is specific for the sequence 5'-GGCC-3' and
is defined by two gene products which are transcribed by different promoters.
The methylase component of the system shows homology to the methylase from
the BspRI and SPR restriction-modification systems.

Nwanko, D.O. and Wilson, G.G., Gene 64:1-8 (1988), describe the
cloning and expression of the MspI restriction and modification genes isolated
from Moraxella sp. This system recognizes the sequence 5'-CCGG-3' and both
enzymes are functional in E. coli. Evidence indicates that these genes are
transcribed in opposite directions, and thus are probably under the control of different
promoters.

Ashok, K.D., et al., Nucleic Acids Research 20:1579-1585 (1992),
describe the purification and characterization of cloned MspI methyltransferase,
over-expressed in E. coli. At low concentrations the enzyme exists as a
monomer, but at higher concentrations it exists mainly as a dimer. Polyclonal
antibodies to the enzyme cross-react with methyltransferase genes of other
modification systems.

Brooks, J.E., et al., Nucleic Acids Research 19:841-850 (1991),
characterizes the cloned BamHI restriction modification system from Bacillus
subtilis. The two genes are divergently oriented and separated by an open reading
frame which may serve as a transcriptional regulator in the native bacteria.
Slatko, B.E., et al., Nucleic Acids Research 15:9781-9796 (1987),
describe the cloning, sequencing and expression of the TaqI restriction-modification
system. These genes have the same transcriptional orientation, with
the methylase gene 5' to the endonuclease gene. E. coli clones which carry only
the endonuclease gene are viable even in the absence of the methylase gene. This
is an unusual case possibly explained by the 65°C optimal temperature for TaqI
restriction and the 37°C optimal temperature for E. coli growth.

Howard, K.A., et al., Nucleic Acids Research 14:7939-7951
(1986), describe the cloning of the DdeI restriction modification system from
Desulfovibrio desulfuricans by a two step method wherein the methylase gene is
first cloned and transformed into E. coli, followed by the cloning of the
endonuclease gene and transformation of this second gene into the
methylase-expressing bacteria. In order to maintain cell viability, high levels of methylase
expression are required before the endonuclease gene can be introduced into the
bacteria.

Ito, H., et al., Nucleic Acids Research 18:3903-3911 (1990),
describe the cloning, nucleotide sequence and expression of the HincII
restriction-modification system. The DNA was isolated from H. influenzae Rc, with the two
genes positioned in the same transcriptional orientation.

Shields, S.L., et al., Virology 176:16-24 (1990), describe the
cloning and sequencing of the cytosine methyltransferase gene M.CviJI from the
Chlorella virus IL-3A. The methylase recognizes the sequence (G/A)GC(T/C/G)
and shows amino acid sequence homology with 5-methylcytosine methylases
isolated from bacteria. DNA encoding the methylase was obtained from the viral
genome which was propagated in the green alga host Chlorella.

Xia, Y., et al., Nucleic Acids Research 15:6075-6090 (1987),
discovered that IL-3A virus infection of a Chlorella-like green alga induces the
expression of the DNA restriction endonuclease CviJI which has novel sequence
specificity. This endonuclease recognizes the sequence PuGCPy (wherein Pu =
purine and Py = pyrimidine) but does not cut the sequence PuGmCPy, where
mC is 5-methylcytosine.

U.S. Patent 5,137,823, issued August 11, 1992, to Brooks, J.E.,
describes a two step method for cloning the BamHI restriction modification
system wherein the methylase is cloned first and then introduced into a bacterial
host. The endonuclease is then cloned and introduced into the methylase
expressing bacteria. This two step procedure provides the host DNA protection
from cleavage by the subsequently introduced endonuclease.

U.S. Patent 5,200,333 ('333), issued April 6, 1993, to Wilson,
G.G., describes a method for cloning restriction and modification genes.
Specifically, this reference describes the cloning of the TaqI and HaeII systems
from Thermus aquaticus and Haemophilus aegyptius, respectively. In this
method, bacterial DNA was initially purified and digested, and the fragments
were then cloned into a vector to produce a bacterial DNA library. The library
was then transformed into E. coli and the cells were plated. Colonies were then
scraped from the plate to form a primary cell library. Plasmid DNA from this
cell library was purified and digested with the endonuclease of the two gene
system. Bacteria which expressed the methylase gene had modified plasmid DNA
which was protected from endonuclease activity, while plasmids from bacteria
which lacked the intact methylase gene were digested. The resulting, undigested
plasmid DNA was then transformed into another bacterial strain and the bacteria
were plated. Surviving colonies were again harvested to give a secondary cell
library and the entire procedure repeated. Plasmids which code for the complete
restriction-modification system presumably survived each round of purification
and were enriched. Bacteria which survived several rounds of enrichment were
subsequently assayed for both methylase and endonuclease activity.

U.S. Patent 5,196,331 ('331), issued March 23, 1993, to Wilson,
G.G. and Nwanko, D., describes a method for cloning the MspI restriction and
modification genes. This patent describes a method identical to that of U.S.
Patent 5,200,333 ('333). '331 is a continuation-in-part of, and '333 is a
continuation of, U.S.S.N. 707,079 (now abandoned).

As mentioned above, Chlorella virus IL-3A encodes a unique
restriction endonuclease called CviJI (Xia et al., Nucleic Acids Res. 15:6075-6090
(1987)). IL-3A is a large, polyhedral, plaque-forming phycodnavirus (Francki,
R.I.B., et al., Arch. Virol. Suppl. 2, Springer-Verlag, Vienna (1991)) that replicates
in unicellular, eukaryotic green algae, Chlorella strain NC64A (Schuster, A.M.,
et al., Virology 150:170-177 (1986)). The double-stranded DNA genome of IL-3A
is approximately 330 kbp (Rohozinski et al., Virology 168:363-369 (1989)) and
contains 9.7% methylated cytidine (Van Etten, J.L. et al., Nucleic Acids Res.
13:3471-3478 (1985)). The cognate methyltransferase of CviJI, M.CviJI,
methylates (A/G)GC(T/C/G) sequences and has been cloned and sequenced
(Shields, S.L. et al., Virology 176:16-24 (1990)).

The use of a two/three base recognition endonuclease, such as
CviJI, to improve numerous conventional molecular biology applications as well
as permitting novel applications has been described in co-pending U.S. Patent
Application Ser. No. 08/036,481, filed on March 24, 1993. The application
discloses methods for generating sequence-specific oligonucleotides from DNA
without prior knowledge of the nucleic acid sequence of such DNA, and
methods for cloning and labeling DNA after restriction digestion by a two base
recognition endonuclease. The application also teaches methods for generating
quasi-random fragments of DNA, methods for cloning, labeling, and sequencing
DNA, as well as epitope mapping of proteins. The ability to generate numerous
oligonucleotides with perfect sequence specificity or quasi-random distributions
of DNA fragments such as is possible with CviJI* has important implications for
a number of conventional and novel molecular biology procedures.

Infection of Chlorella species NC64A with the IL-3A virus
produces sufficient CviJI restriction endonuclease (CviJI) for research purposes.
However, production of commercially useful amounts of CviJI is limited with this
system due to the slow growth of Chlorella algae, the large number of
contaminating nucleases associated with the virus, and the small yield of enzyme
obtained after purification. In addition, biochemical and biophysical
characterization of the enzyme, such as molecular weight determination, is
difficult from the native source. Because of these limitations it would be useful
to clone the gene for CviJI in order to provide an adequate large scale source of
enzyme for use as a molecular biological reagent.

SUMMARY OF THE INVENTION 
In one of its aspects, the present invention provides purified and
isolated polynucleotides (e.g., DNA sequences and RNA transcripts thereof)
encoding a unique restriction endonuclease, CviJI, as well as polypeptides and
variants thereof which display activities characteristic of CviJI. Activities of CviJI
include the recognition of specific DNA sequences, binding to these sequences
and cleaving the bound DNA into fragments. Preferred DNA sequences of the
invention include viral genomic sequences as well as wholly or partially
chemically synthesized DNA sequences. Replicas (i.e., copies of the isolated
DNA sequences made in vivo or in vitro) of DNA sequences of the invention are
also contemplated. A preferred DNA sequence is set forth in SEQ ID NO: 2
herein and is contained as an insert in the plasmid pCJHl.4. In another of its
aspects, the invention provides purified isolated DNA encoding a CviJI
polypeptide by means of degenerate codons.

Also provided are autonomously replicating recombinant
constructions such as plasmid DNA vectors incorporating CviJI sequences and
especially vectors wherein DNA encoding CviJI or a CviJI variant is operatively
linked to an endogenous or exogenous expression control DNA sequence.

According to another aspect of the invention, host cells, such as
prokaryotic and eukaryotic cells, are stably transformed with DNA sequences of
the invention in a manner allowing the desired polypeptides to be expressed
therein. Host cells expressing CviJI and CviJI variant products are useful in
methods for the large scale production of CviJI and CviJI variants wherein the
cells are grown in a suitable culture medium and the desired polypeptide products
are isolated from the host cells or from the medium in which the cells are grown.
A preferred host cell is E. coli. Still another aspect of the invention is a
recombinant CviJI polypeptide.

The present invention is also directed to a method for the digestion
of DNA with a restriction endonuclease reagent under conditions wherein said
DNA is cleaved at a dinucleotide sequence selected from the group consisting of
PyGCPy, PuGCPy, PuGCPu, and wherein Pu = purine and Py = pyrimidine.

The present invention is also directed to a method for restriction
endonuclease digestion of DNA comprising the step of digesting DNA with a
restriction endonuclease reagent under conditions wherein said DNA is digested
at 11 of 16 possible dinucleotide sequences and wherein said dinucleotide
sequences are selected from the group consisting of PuCGPu, PuCGPy, and
PyCGPu, and wherein Pu = purine and Py = pyrimidine.

The present invention is directed to shotgun cloning of DNA,
epitope mapping, and labeling DNA using the digestion methods of the present
invention. The present invention provides methods for quasi-random fragmenting
of DNA using the digestion methods of the present invention under conditions
wherein the DNA is only partially cleaved and the site preference of the
restriction endonuclease reagent is greatly reduced. By quasi-random is meant an
overlapping population of DNA fragments produced by digesting DNA using the
methods of the present invention without apparent site-preference and which
appears as a smear upon electrophoresis in a 1-2 wt. % agarose gel. The present
invention is also directed to the shotgun cloning and sequencing of quasi-random
fragments of DNA produced by the methods of the present invention.
Quasi-random fragments in the shotgun cloning method of the present invention are
produced by partial digestion of DNA with a restriction endonuclease reagent
according to the methods of the present invention. More particularly,
quasi-random fragments of DNA useful in the cloning method of the present invention
are produced by the partial digestion of the DNA to be cloned with CviJ I, BsuR
I or with a restriction endonuclease reagent termed CGase I comprising Taq I and
Hpa II. Quasi-random fragments having a length of between about 100 and about
10,000 nucleotides are preferred. More preferred are quasi-random fragments of
about 500 to about 10,000 nucleotides in length. The present invention is also
directed to the generation of quasi-random fragmentation of DNA using the
method of the present invention for the purposes of epitope mapping and gene
cloning. These quasi-random fragments are expressed either in vitro or in vivo
and the smallest fragment containing the desired function is identified by
screening assays well known in the art.

The present invention is also directed to the production of
anonymous primers from any DNA without prior knowledge of the nucleotide
sequence. The present invention provides methods for anonymous primer cloning
and sequencing after complete digestion of DNA utilizing CviJ I, BsuR I or
CGase I using the methods of the present invention.

Additionally, the present invention is directed to methods of
labeling and detecting DNA comprising the complete digestion of DNA using the
methods of the present invention, followed by a heat denaturation step, to yield
sequence specific oligonucleotides. In particular, an aspect of the present
invention involves labeling DNA with sequence specific oligonucleotides of about
20 to about 200 bases in length (with an average size of between 20-60 bases)
generated by CviJ I, BsuR I or CGase I digestion of the template DNA.

More particularly, the invention is directed to restriction generated
oligonucleotide labeling (RGOL) of DNA which comprises the digestion of an
aliquot of template DNA with CviJ I followed by a simple heat denaturation step,
thereby generating numerous sequence specific oligonucleotides, which can then
be utilized for labeling nucleic acids by a number of methods, including primer
extension type reactions with a DNA polymerase and various labels, isotopic
or non-isotopic (RGOL-PEL); 5' end labeling with polynucleotide kinase; 3' end
labeling using terminal transferase and various labels, isotopic or non-isotopic.
Labeling at the 3' end, also referred to as tailing, adds numerous labels per
oligonucleotide (1-200), depending on the labeling conditions. The addition of
10-500 oligonucleotides generated per template results in a significant signal
amplification not obtainable by conventional methods.

The invention is also directed to thermal cycle labeling (TCL)
which comprises the simultaneous labeling and amplification of probes utilizing
CviJ I or CGase I restriction generated oligonucleotides as the starting material.
In this method, natural DNA of unknown sequence is digested with CviJ I to
generate numerous double-stranded fragments which are then heat denatured to
yield oligonucleotides. These oligonucleotides are combined with the intact
template and subjected to repeated cycles of denaturation, annealing, and
extension in the presence of a thermostable DNA polymerase or functional
fragment thereof which maintains polymerase activity, deoxynucleotide
triphosphates and the appropriate buffer. Alpha-32P-dATP (or any of the other
three deoxynucleotide triphosphates), biotin-dUTP, fluorescein-dUTP, or
digoxigenin-dUTP is incorporated during the extension step for subsequent
detection purposes. Thermal cycle labeling efficiently labels DNA while
simultaneously amplifying large amounts of the labeled probe. In addition, TCL
probes exhibit a 10 fold improvement in detection sensitivity compared to
conventional probes.

The present invention is also directed to TCL in which the
thermostable DNA polymerase supplies endogenous primers for enzymatic
extension. This method is referred to as Universal Thermal Cycle Labeling
(UTCL). In this method natural DNA of unknown sequence is combined intact
with the holo-enzyme of a thermostable DNA polymerase, deoxyribonucleotide
triphosphates, and the appropriate buffer. The holo-enzyme and its associated
endogenous primers are then combined with intact template and subjected to
repeated cycles of denaturation, annealing, and extension. Alpha-32P-dATP, -32P-dTTP,
-32P-dGTP, -32P-dCTP, biotin-dUTP, fluorescein-dUTP, or digoxigenin-dUTP
is also included in the extension step for subsequent detection purposes.
Isotopic labels useful in the practice of the present invention include but are not
limited to 32P, 33P, 35S, 14C and 3H. Non-isotopic labels useful in the present
invention include but are not limited to fluorescein, biotin, dinitrophenol and
digoxigenin.

The present invention is also directed to an improved method for
purifying CviJ I from the algae Chlorella infected with the virus IL-3A.

In addition, the present invention is directed to restriction
endonuclease reagents which, under conditions which relax the sequence
specificity of one or more restriction endonucleases, cleave DNA at the
dinucleotide sequences AT or TA.

The present invention is also directed to a restriction endonuclease
reagent comprising, in combination, Taq I and Hpa II, which is capable of
digesting DNA at 11 of 16 possible dinucleotide sequences, said sequences
selected from the group consisting of PuCGPu, PuCGPy, PyCGPy and PyCGPu,
and wherein Pu = purine and Py = pyrimidine.

The following examples are intended to be illustrative of the several
aspects of the present invention and are not intended in any way to limit the scope
of any aspect of the present invention.

BRIEF DESCRIPTION OF THE DRAWINGS 
Figure 1 is a map of the plasmid p710 which contains DNA
sequences encoding the IL-3A viral methyltransferase M.CviJI;

Figure 2 is the nucleotide sequence of 5497 bp of cloned IL-3A
viral DNA;

Figure 3 is a restriction map of the cloned IL-3A viral DNA,
including the identified open reading frames;

Figure 4 is the DNA sequence of the CviJI gene with its flanking
regions. The predicted amino acid sequence is provided below the nucleotide
sequences;

Figure 5A depicts the theoretical frequency and distribution of
CviJI* restriction generated oligomers of individual lengths; Figure 5B shows the
actual frequency and distribution of CviJI* restriction generated oligomers of
various lengths;

Figure 6 is a flow chart depicting anonymous primer cloning;

Figure 7 is a photographic reproduction of a gel depicting CviJI
restriction digests of pUC19;

Figure 8 is a photographic reproduction of a gel depicting
comparisons of sonicated versus CviJI* partially digested DNAs;

Figure 9A is a photographic reproduction of an agarose gel
electrophoresis analysis of size-fractionated DNA by microcolumn
chromatography compared to fractionation by agarose gel electroelution;

Figure 9B-E illustrates additional trials of the same procedures
used in Figure 9A;

Figure 10A illustrates the size distribution of DNA fragments
produced by partial digestion of DNA by CviJI and fractionated by microcolumn
chromatography;

Figure 10B-C illustrates the size distribution of DNA fragments
produced by partial digestion of DNA by CviJI and fractionated by agarose gel
electrophoresis;

Figure 11 is a schematic depiction of the distribution of CviJI sites
in pUC19; and

Figure 12 is a graph of the rate of sequence accumulation by
CviJI shotgun cloning and sequencing.



DETAILED DESCRIPTION
The gene for the restriction endonuclease R.CviJI was cloned into
E. coli so as to provide an adequate source of R.CviJI for use as a molecular
biological reagent. Biologically active CviJI has been purified from E. coli to
apparent homogeneity. The molecular weight of E. coli derived R.CviJI is 32.5
kD by SDS gel electrophoresis. N-terminal amino acid sequence analysis of this
protein and comparison to the nucleotide sequence of the gene revealed that the
translation of this enzyme is probably initiated with a GTG start codon, instead
of the usual ATG initiation codon. The structural gene is 834 nucleotides in
length coding for a protein of 278 amino acids (31.6 kD). A second peak of
R.CviJI activity which elutes separately from the 32.5 kD form can be seen in the
initial stages of enzyme purification. Trace amounts of a larger molecular weight
form have not been observed to date. However, the R.CviJI gene does possess
an in-frame upstream ATG codon which if translated would yield a predicted 41.4
kD protein. The structural gene for this potentially larger product is 1074
nucleotides in length coding for a putative protein of 358 amino acids.
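
The gene lengths and protein sizes quoted above are internally consistent, as a short calculation shows. The sketch below divides each stated nucleotide length by three to recover the stated residue count, and divides each stated mass by its residue count for comparison with the common rule-of-thumb average of roughly 110 Da per residue. The function names are illustrative, and the 110 Da figure is an outside approximation, not a value from this document.

```python
def residues_encoded(nucleotides: int) -> int:
    """Number of amino acids encoded by a coding region of this length."""
    if nucleotides % 3 != 0:
        raise ValueError("coding length must be a whole number of codons")
    return nucleotides // 3


def daltons_per_residue(mass_kd: float, residues: int) -> float:
    """Average residue mass implied by a stated protein mass."""
    return mass_kd * 1000.0 / residues


if __name__ == "__main__":
    # (label, nucleotides, predicted mass in kD), all taken from the text above.
    forms = [("GTG-initiated form", 834, 31.6),
             ("upstream ATG form", 1074, 41.4)]
    for label, length, mass in forms:
        n = residues_encoded(length)
        print(label, n, round(daltons_per_residue(mass, n), 1))
    # Extra codons gained by starting at the upstream ATG.
    print(residues_encoded(1074 - 834))
```

Both forms come out a little above 110 Da per residue, which is within the normal range for such an estimate, and the upstream ATG lies 80 codons ahead of the GTG start.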

The present invention is also directed to a method for the
fragmentation and cloning of DNA using the restriction endonuclease CviJ I under
conditions which allow the enzyme to cleave DNA at the dinucleotide sequence
GC. In addition, the present invention is also directed to the cloning of
quasi-random fragments of DNA digested using the fragmentation method of the present
invention.

As an alternative to the methods for constructing random clone 
libraries described above, methods were devised for the construction of such 
libraries which require fewer steps and reagents, which require smaller amounts 
of DNA, which have relatively high cloning efficiencies and which take less time 
to complete. These methods relate to the recognition that a partial digest with a 
two or three base recognition endonuclease cleaves DNA frequently enough to be 
functionally random with respect to the rate at which sequence data may be 
accumulated from a shotgun clone bank. The restriction enzyme CviJ I normally 
recognizes the sequence PuGCPy and cleaves between the G and C to leave blunt 
ends (Xia et al., Nucl. Acids Res. 15:6075-6090 (1987)). Under "relaxed" 
conditions (in the presence of 1 mM ATP and 20 mM DTT) the specificity of 
CviJ I can be altered to cleave DNA more frequently and perhaps as frequently 
as at every GC. This activity is referred to as CviJ I*. Because of the high 
frequency of the dinucleotide GC in all DNA (16 bp average fragment size for 
random DNA), quasi-random libraries may be constructed by partial digestion of 
DNA with CviJ I*. A DNA degradation method with low levels of sequence 
specificity produces a smear of the target DNA when analyzed by agarose gel 
electrophoresis. Digestion of the plasmid pUC19 under partial CviJ I* conditions 
does not result in a non-discrete smear; rather, a number of discrete bands are 
found superimposed upon a light background of smearing, suggesting that CviJ 
I has some site preference. Atypical reaction conditions according to the present 
invention eliminate this apparent site preference of CviJ I* to produce an activity 
(termed CviJ I**) which, in combination with a rapid gel filtration size exclusion 
step, streamlines a number of aspects involved in shotgun cloning. 
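
The 16 bp average fragment size quoted above follows from the expected frequency of a dinucleotide in random sequence. The sketch below illustrates the arithmetic and shows one way to count candidate sites in a sequence. It assumes independent, equally frequent bases (an idealization that real genomes do not obey), and the helper names are hypothetical.

```python
import re


def expected_spacing(site_probability: float) -> float:
    """Mean distance in bp between sites that occur with the given
    per-position probability."""
    return 1.0 / site_probability


# Assumption: each base is independent and occurs with probability 1/4.
p_gc = (1 / 4) * (1 / 4)                          # G followed by C
p_pugcpy = (1 / 2) * (1 / 4) * (1 / 4) * (1 / 2)  # purine, G, C, pyrimidine

# Zero-width lookaheads so that overlapping sites are all counted.
GC_SITE = re.compile(r"(?=GC)")
PUGCPY_SITE = re.compile(r"(?=[AG]GC[CT])")


def count_sites(sequence: str, pattern: re.Pattern) -> int:
    """Count occurrences of a site pattern on the given strand."""
    return sum(1 for _ in pattern.finditer(sequence.upper()))


print(expected_spacing(p_gc))      # relaxed two-base recognition
print(expected_spacing(p_pugcpy))  # normal PuGCPy recognition
print(count_sites("AGCTGCGC", GC_SITE), count_sites("AGCTGCGC", PUGCPY_SITE))
```

On these assumptions the relaxed activity would cut roughly four times as often as the normal PuGCPy activity (every 16 bp rather than every 64 bp on average).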

One aspect of the present invention involves the use of the 
two/three base recognition endonuclease CviJ I, in conjunction with a simple 
spin-column method to produce libraries equivalent in final form to those generated by 
the combination of sonication and agarose gel electroelution. However, the 
method of the present invention requires fewer steps, a shorter time period, and 
significantly less substrate (nanogram amounts) when compared to conventional 
procedures. Both small and large sequencing projects using the methods 
described herein are within the scope of the present invention. 

Current sequencing paradigms require the generation of a new 
template for each 350-500 nucleotides sequenced. On this basis, sequencing both 
strands of the human genome would require at least 12 million templates 500 
nucleotides long, assuming no overlap between templates. 
A random approach, such as shotgun sequencing, would require 30 
to 50 million templates, assuming the entire genome were randomly subcloned. 
As many as 250,000 libraries may be needed to generate the requisite templates 
from a subcloned and ordered array of this genome, depending on the type of 
vector utilized, and the degree of overlap between such clones. The ability to 
generate shotgun libraries in a semi-automated, microtiter plate format would 
greatly simplify such large scale projects. 
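
The template counts above can be related to one another with a short calculation. This sketch back-calculates the genome size implied by the 12 million template figure and expresses the shotgun estimates as a redundancy relative to that minimum. The variable and function names are illustrative only.

```python
TEMPLATE_LENGTH = 500        # nucleotides read per template (upper figure in the text)
MIN_TEMPLATES = 12_000_000   # non-overlapping templates covering both strands

# Total nucleotides covered when both strands are read without overlap.
nucleotides_both_strands = MIN_TEMPLATES * TEMPLATE_LENGTH
# Genome size in base pairs implied by the figures above.
implied_genome_bp = nucleotides_both_strands // 2


def redundancy(templates: int) -> float:
    """Fold excess of templates over the non-overlapping minimum."""
    return templates / MIN_TEMPLATES


print(implied_genome_bp)
print(redundancy(30_000_000), round(redundancy(50_000_000), 2))
```

Read this way, the 30 to 50 million templates of a fully random approach correspond to roughly 2.5 to 4.2 times the non-overlapping minimum.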

The development of methods for cloning large DNA molecules in 
yeast artificial chromosomes (Burke et al., Science 236:806-812 (1987)), or in 
bacteriophage P1-derived vectors (Sternberg, Proc. Natl. Acad. Sci. USA 
87:103-107 (1990)), simplifies the subdivision and analysis of very large genomes. 
However, the large size of the resulting subclones (100 - 1000 kbp) presents 
additional challenges for subsequent sequencing efforts. A report of the 
sequencing of a 134 kbp genome by random shotgun cloning directly into a 
bacteriophage M13 vector indicates that numerous intermediate stages of 
subcloning, mapping, and overlapping such clones may be eliminated (Davison, 
J. DNA Seq. and Mapping 1:389-394 (1992)). An order of magnitude reduction 
in the amount of DNA required for shotgun cloning would substantially simplify 
efforts to directly sequence 100,000 bp sized molecules and beyond. 

The ability to generate an overlapping population of randomly 
fragmented DNA molecules is considered essential for minimizing the closure of 
nucleotide sequence gaps by the shotgun cloning method. The use of a very 
frequent-cutting restriction enzyme, such as CviJ I, is an approach which has not 
been utilized. Reaction conditions according to the present invention result in the 
quasi-random restriction of pUC19 and lambda DNA, as judged by the degree of 
smearing observed. 

The randomness of this CviJ I** reaction was quantified by 
sequence analysis of 76 such partially-fragmented pUC19 subclones. The analysis 
showed that CviJ I** partial digestion (limiting enzyme and time) restricts DNA 
at PyGCPy, PuGCPu, and PuGCPy (but not PyGCPu), and is thus a hybrid 
reaction which combines the three base recognition specificity of CviJ I with the 
"two" base recognition specificity of CviJ I*. Interestingly, most of the "relaxed" 
cleavage observed under CviJ I** conditions occurred in those portions of the 
sequence which were deficient in "normal" restriction sites. CviJ I** treatment 
produces a relatively uniform size distribution of DNA fragments, permitting 
sequence information to be accumulated in a statistically random fashion. 
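
The site classes named above can be made concrete by labeling each GC dinucleotide according to its flanking bases. The following is a minimal sketch, not the analysis actually performed on the 76 subclones: it scans one strand of a linear sequence, skips GC dinucleotides that lack a flanking base at either end, and uses hypothetical helper names throughout.

```python
PURINES = set("AG")
PYRIMIDINES = set("CT")

# Site classes reported above as cleaved under CviJ I** conditions.
CLEAVED_UNDER_DOUBLE_STAR = {"PuGCPy", "PyGCPy", "PuGCPu"}


def flank_class(base: str) -> str:
    """Classify a single base as purine (Pu) or pyrimidine (Py)."""
    base = base.upper()
    if base in PURINES:
        return "Pu"
    if base in PYRIMIDINES:
        return "Py"
    raise ValueError(f"unexpected base: {base!r}")


def classify_gc_sites(sequence: str):
    """Return (index of G, site class, cleaved?) for every flanked GC."""
    seq = sequence.upper()
    sites = []
    for i in range(1, len(seq) - 2):
        if seq[i:i + 2] == "GC":
            label = f"{flank_class(seq[i - 1])}GC{flank_class(seq[i + 2])}"
            sites.append((i, label, label in CLEAVED_UNDER_DOUBLE_STAR))
    return sites


for site in classify_gc_sites("AGCTTGCATGCC"):
    print(site)
```

In this toy sequence the middle site is of the PyGCPu class and is therefore the only one flagged as not cleaved.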

Shotgun cloning with CviJ I** digested DNA is efficient partly 
because the resulting fragments are blunt ended. Other methods currently used 
to randomly fragment DNA, including sonication, DNAse I treatment, and low 
pressure shearing, leave ragged ends which must be converted to blunt ends for 
efficient vector ligation. Other than a heat denaturation step to inactivate the 
endonuclease, no additional treatments are required for cloning CviJ I** restricted 
DNA. In addition, the preligation step required to equalize representation of the 
ends of a DNA molecule prior to sonication or DNAse I treatment is not 
necessary with CviJ I** fragmentation. CviJ I** cleaves its cognate recognition 
site very close to the ends of a linear molecule, as judged by the very small 
fragments resulting from complete digestion of pUC19 as depicted in Figure 2, 
lane 1. 

The overall efficiency of shotgun cloning depends not only on the 
fragmentation process, but also upon the size fractionation procedure used to 
remove small DNA fragments. The efficiency of cloning agarose gel fractionated 
DNA was found to be unexpectedly variable. Numerous experiments produced 
an erratic distribution of sized material and the resulting cloned inserts were 
uniformly small (70% < 500 bp in one trial, 100% < 500 bp in another). The 
method of the present invention includes a simple and rapid micro-column 
fractionation method, which has resulted in three to thirteen times more 
transformants than agarose gel fractionation. More importantly, the size 
distribution of the cloned inserts from column-fractionated DNA was skewed 
toward larger fragments (88% > 500 bp). Micro-column fractionation also 
eliminates the chemical extraction steps required for agarose fractionated DNA. 
After the target DNA has been column-fractionated, no further treatments are 
required for cloning. Combining CviJ I** partial restriction with micro-column 
fractionation permits the construction of useful libraries from as little as 200 ng 
of substrate, an order of magnitude less starting material than recommended for 
sonication/end-repair and agarose gel fractionation procedures. 

The CviJ I** reaction represents a unique alternative for controlling 
the partial digestion of DNA, a technique which is fundamental to the construction 
of genomic libraries (Maniatis et al., Cell 15:687-701 (1978)), and restriction site 
mapping of recombinant clones (Smith, et al., Nucl. Acids Res. 3:2387-2398 
(1976)). Partial DNA digests are notably variable and are strongly dependent on 
the concentration and purity of the DNA, the amount of enzyme used, the 
incubation time, and the batch of enzyme. Partial digestions may also be variable 
with respect to the rate at which a particular recognition sequence is cleaved 
throughout the substrate. Optimal reaction conditions, such as those which render 
such partial digests independent of one or more of these variables, allow more 
precise control of the end product. Several controlling schemes may be 
employed, including: the addition of a constant amount of carrier DNA (Kohara 
et al., Cell 50:495-508 (1987)), the use of limiting amounts of Mg2+ (Albertson 
et al., Nucl. Acids Res. 17:808 (1989)), ultraviolet irradiation (Whittaker, et al., 
Gene 41:129-134), and the combination of a restriction enzyme and a sequence 
complementary DNA methylase (Hoheisel et al., Nucl. Acids Res. 17:9571-9582 
(1989)). Utilizing three different batches of CviJ I, and three different DNA 
templates from five separate preparations, a uniform CviJ I** partial digestion 
pattern was obtained that was primarily time-dependent when a constant ratio of 
0.3 units of enzyme per µg of DNA was used. 

The rate at which a particular restriction site is cleaved at different 
locations in a substrate is variable for many endonucleases (Brooks, et al., 
Methods in Enzymol. 152:113-129 (1987)). Reaction conditions for CviJ I may 
be optimized to substantially reduce the site preferences of this enzyme during 
partial digestion (see Figure 2, lanes 3 and 4). Normally, "star" reaction 
conditions result in cleavage at new sites. The use of star reaction conditions 
according to the present invention (dimethyl sulfoxide [DMSO] and lowered ionic 
strength) to affect the partial digestion activity of CviJ I** does not result in an 
altered restriction site cleavage as assayed by sequencing the products of 76 
digestion reactions. Instead, the relative rate of cleavage of individual sites 
appears to be more uniform under these conditions. A 3-5 fold increase in the 
rate of normal CviJ I restriction with the standard buffer and DMSO further 
substantiates this approach. All of these results indicate that, under the 
appropriate reaction conditions, CviJ I is useful for a number of other 
applications, such as high resolution restriction mapping and fingerprinting, 
diagnostic restriction of small PCR fragments, and construction of genomic DNA 
libraries. 

Another aspect of the present invention involves quasi-random 
fragmentation of DNA using the method of the present invention for epitope 
mapping and cloning intact genes. The same method as described above for 
shotgun cloning is utilized, except that an expression vector is used to generate 
functional proteins from the DNA. 

Another aspect of the present invention involves fragmenting DNA 
using the present invention to generate multiple oligonucleotides from any 
double-stranded DNA template. Restriction-generated oligonucleotides (RGO) are 
sequence specific oligonucleotides generated from any DNA according to the 
present invention. CviJ I presumably cleaves the recognition sequence GC 
between the G and C to leave blunt ends (Xia et al., Nucl. Acids Res. 
15:6075-6090 (1987)). Because of the high frequency of dinucleotide GC in all DNA 
(16 bp average fragment size for random DNA), a complete CviJ I* restriction 
results in numerous fragments which are about 20-200 bp in size. These 
restriction fragments are generated from an aliquot of the template itself and are 
heat-denatured, to yield numerous single-stranded oligonucleotides which are of 
variable length but which are specific for the cognate template. Complete CviJ 
I* restriction of the small plasmid pUC19 (2689 bp) theoretically yields 314 
oligonucleotides after a heat-denaturation step. The ability to generate numerous 
oligonucleotides with perfect sequence specificity is an unusual result of the use 
of this class of enzyme according to the present invention. Such oligonucleotides 
are uniquely suited for purposes of labeling DNA, as described below. 
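
The figure of 314 oligonucleotides can be related to the number of cleavage sites by simple counting. In the sketch below, complete digestion of a circular molecule at n sites is taken to give n double-stranded fragments, each of which yields two single strands on heat denaturation. The site count of 157 is back-calculated from the stated 314 and is not itself given in the text, and the function names are hypothetical.

```python
def fragments_from_complete_digest(n_sites: int, circular: bool = True) -> int:
    """Number of double-stranded fragments after complete digestion."""
    if n_sites == 0:
        return 1  # the uncut molecule
    return n_sites if circular else n_sites + 1


def oligonucleotides_after_denaturation(n_sites: int, circular: bool = True) -> int:
    """Each double-stranded fragment gives two single-stranded oligonucleotides."""
    return 2 * fragments_from_complete_digest(n_sites, circular)


PLASMID_BP = 2689     # pUC19 size as stated in the text
STATED_OLIGOS = 314   # theoretical yield stated in the text

# Back-calculated number of GC sites (an inference, not a stated value).
implied_sites = STATED_OLIGOS // 2
mean_fragment_bp = PLASMID_BP / implied_sites

print(implied_sites, round(mean_fragment_bp, 1))
```

The implied mean fragment length of about 17 bp is close to the 16 bp expected for random sequence.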

One application of CviJ I restriction-generated oligonucleotides is 
to directly label them using conventional methods. There are several important 
advantages in using CviJ I* restriction-generated oligonucleotides. Conventional 
methods employing synthetic oligonucleotides for detection purposes generally use 
one oligonucleotide containing one or a few labels. A complete CviJ I* digest 
generates hundreds of oligonucleotides from a given template, depending on the 
size of the template, and thus makes hundreds of sites available for labeling, 
regardless of the labeling scheme utilized. These hundreds of sequence specific 
restriction-generated oligonucleotides have two important advantages over 
conventional probes used in nucleic acid detection methods. First, the generation 
of multiple oligonucleotide probes directed at multiple sites in a given target 
(theoretically, 314 sites in pUC19) provides enhanced detection sensitivities 
compared to synthetic oligonucleotides which are directed at 1 or a few sites in 
a target. The numerous labeled restriction-generated oligonucleotides represent 
a 10-100 fold amplification of the signal for detection compared to the use of a 
single oligonucleotide. Second, the short length of the restriction-generated 
oligonucleotides permits more efficient hybridization. This is important for two 
reasons. First, hybridization times using restriction-generated oligonucleotides are 
reduced to 1 hr as opposed to an overnight incubation with conventional probes 
hundreds of nucleotides in length. This is a very important advantage when using 
oligonucleotide probes in clinical settings. Second, the penetration of probes into 
permeabilized cells is a critical issue for in situ hybridization procedures. The 
smaller the probe, the easier the entry into the cell. Thus, the use of multiple 
oligonucleotide probes generated by the two base cutters greatly improves the 
sensitivity of in situ hybridization, a technique of considerable importance in 
research and clinical labs. Finally, when using membrane-based hybridization 
procedures, only small sections of a target nucleic acid are exposed and available 
for hybridization. Multiple oligonucleotides derived from a cognate template 
exhibit better detection sensitivities compared to long probes. 

Another application of restriction-generated oligonucleotides for 
labeling is to employ them as primers in a polymerase extension labeling reaction 
in conjunction with a repetitive thermal cycling regimen of denaturation, 
annealing, and extension. Thermal Cycle Labeling (TCL) is a method for 
efficiently labeling double-stranded DNA while simultaneously amplifying large 
amounts of the labeled probe. The TCL system employs the two base recognition 
endonuclease CviJ I* to generate sequence-specific oligonucleotides from the 
template DNA itself. These oligonucleotides are combined with the intact 
template and subjected to repeated cycles of denaturation, annealing, and 
extension by a thermostable DNA polymerase from, for example, Thermus flavus. 
A radioactive- or non-isotopically-labeled deoxynucleotide triphosphate is 
incorporated during the extension step for subsequent detection purposes. The 
amplified, labeled probes represent a very heterogeneous mixture of fragments, 
which appears as a large molecular weight smear when analyzed by agarose gel 
electrophoresis. Primer-primer amplification, a side product of this reaction 
(produced by leaving out the intact template in the TCL reaction), may result in 
enhanced detection sensitivity, perhaps by forming branched structures. 
Biotin-labeled probes generated by the TCL protocol detect as little as 25 zeptomoles 
(2.5 x 10^-20 moles) of a target sequence. A 50 µl TCL reaction yields as much 
as 25 µg of labeled DNA, enough to probe 25 to 50 Southern blots. After 20 
cycles of denaturation and extension, biotin-dUTP-incorporated TCL probes may 
be routinely detected at a 1:10^ dilution, which is 1000 fold more sensitive than 
RPL, and indicates that a significant degree of net synthesis or amplification of 
the probe is occurring. In addition, non-isotopically-labeled TCL probes exhibit 
a 10-fold improvement in detection sensitivity when compared to RPL-generated 
probes. 32P-labeled probes generated by the TCL protocol may also detect as 
little as 50 zeptomoles (2.5 x 10^-20 moles) of a target sequence. As little as 10 
pg of template DNA is enough to synthesize 5-10 ng of labeled probe. The 
radioactive version of TCL generates probes having extremely high specific 
activities, e.g. (about 5 x 10^9 cpm/µg DNA), which permits 5 to 10-fold lower 
detection limits than conventional labeling protocols. 
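
To put the zeptomole detection limits in perspective, the quantity can be converted to moles and to an approximate number of target molecules. This is a minimal sketch using the SI definition of the zepto prefix (10^-21) and the Avogadro constant. The function names are illustrative.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)
ZEPTO = 1e-21             # SI prefix


def zeptomoles_to_moles(zmol: float) -> float:
    """Convert an amount in zeptomoles to moles."""
    return zmol * ZEPTO


def zeptomoles_to_molecules(zmol: float) -> float:
    """Convert an amount in zeptomoles to a number of molecules."""
    return zeptomoles_to_moles(zmol) * AVOGADRO


biotin_limit_moles = zeptomoles_to_moles(25)
biotin_limit_molecules = zeptomoles_to_molecules(25)

print(f"{biotin_limit_moles:.1e} mol")
print(f"about {biotin_limit_molecules:,.0f} molecules")
```

By this conversion, a 25 zeptomole detection limit corresponds to roughly fifteen thousand copies of the target sequence.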

There are several advantages to using restriction-generated 
oligonucleotides for primer extension labeling of DNA. One advantage is the 
specificity of the primers. All of the oligonucleotides generated by the TCL 
system are specific for the template utilized, unlike random primer labeling (RPL) 
which utilizes synthetic oligonucleotides 6-9 bases in length having a random 
sequence. The amount of primer required for efficient labeling with the TCL 
system is only 10 ng, compared to the 10 µg of random primers utilized for RPL. 
Due to their short length, random primers anneal very inefficiently above 
25-37°C, thus RPL is limited to DNA polymerases such as Klenow or T7. The 
restriction-generated oligonucleotides are longer than the random primers, 
which extends the hybridization and extension conditions to include a wide variety 
of temperatures and polymerases. Thus, the use of the restriction-generated 
sequence-specific oligonucleotides results in more efficient hybridization and 
extension as compared to RPL. The TCL system has been optimized for labeling 
with a thermostable DNA polymerase which allows the option of temperature 
cycling. After 20 cycles of denaturation and extension, a significant amount of 
amplified TCL probes can be generated. Most importantly, TCL-labeled probes 
exhibit a 10 fold improvement in detection sensitivity when compared to 
RPL-generated probes. 

Another aspect of the present invention involves a variation of TCL 
called Universal Thermal Cycle Labeling (UTCL) in which the extension primers 
are not supplied by CviJI restriction, but rather, are found endogenously in the 
enzyme preparations of thermostable DNA polymerases. Random sequence DNA 
is usually co-purified along with the holo-enzyme preparation of the thermostable 
DNA polymerases, regardless of the source of the enzyme, i.e. native or cloned. 
However, only the holo-enzyme, and not the exonuclease minus deletion variants, 
contains the endogenous DNA. Typically, when the holo-enzymes of thermostable 
polymerases are used in protocols such as the polymerase chain reaction, the 
presence of such primers can create spurious results. Methods for circumventing 
the problems of endogenous DNA are described in PCR Protocols: A Guide to 
Methods and Applications, Eds. M. Innis, et al., Academic Press, 1990. 

This residual DNA is rather short (approximately 5-25 bases), as 
assayed by end-labeling with [γ-32P]ATP and polynucleotide kinase, and acts as 
endogenous "random" primers in a TCL-type reaction. In UTCL, the 
holo-enzyme of a thermostable polymerase from, for example, Thermus flavus, is 
combined with the intact DNA template and subjected to repeated cycles of 
denaturation, annealing, and extension. A radioactive- or non-isotopically-labeled 
deoxynucleotide triphosphate is incorporated during the extension step for 
subsequent detection purposes. The amplified, labeled probe represents a very 
heterogeneous mixture of fragments, which appears as a large molecular weight 
smear when analyzed by agarose gel electrophoresis. Biotin-labeled probes 
generated by the UTCL protocol detect as little as 25 zeptomoles (2.5 x 10^-20 
moles) of a target sequence. A 15 µl UTCL reaction yields as much as 5-10 µg 
of labeled DNA, enough to probe 5 to 10 Southern blots. After 20 cycles of 
denaturation and extension, biotin-dUTP-incorporated UTCL probes may be 
routinely detected at a 1:10^ dilution, which is 1000 fold more sensitive than 
RPL, and indicates that a significant degree of net synthesis or amplification of 
the probe is occurring. In addition, non-isotopically-labeled UTCL probes exhibit 
a 10-fold improvement in detection sensitivity when compared to RPL-generated 
probes. 32P-labeled probes generated by the UTCL protocol may also detect as 
little as 50 zeptomoles (2.5 x 10^-20 moles) of a target sequence. The radioactive 
version of UTCL generates probes having extremely high specific activities, e.g. 
(about 5 x 10^9 cpm/µg DNA), which permits 5 to 10-fold lower detection limits 
than conventional labeling protocols. 

The present invention is illustrated by the following examples 
relating to the isolation of a full length viral DNA clone encoding R.CviJI, to the 
expression of R.CviJI DNA in E. coli strain DH5αF'MCR and to purification of 
R.CviJI from this bacterial strain. More particularly, Example 1 provides for the 
propagation of IL-3A virus and isolation of viral genomic DNA. Example 2 
addresses the improved expression of a clone for the viral methylase M.CviJI. 
Example 3 describes the strategy for isolating and cloning the viral R.CviJI gene 
by a forced co-cloning strategy of the M.CviJI gene. Example 4 describes the 
sequencing of cloned IL-3A genomic DNA and identification of the R.CviJI gene. 
Example 5 relates the methods for purification of CviJI to homogeneity from an 
E. coli strain, DH5αF'MCR, transformed with a plasmid which encodes the 
R.CviJI enzyme. Example 6 details the amino acid sequence analysis of the 
purified R.CviJI enzyme. Example 7 describes the analysis of CviJI** recognition 
sequences. Example 8 relates to a technique for producing restriction generated 
oligonucleotides using CviJI. Example 9 relates the generation of anonymous 
primers using CviJI. Example 10 describes end-labeling of CviJI restriction 
generated oligonucleotides. Example 11 describes primer extension labeling of 
DNA using restriction generated oligonucleotides. Example 12 relates the use of 
CviJI in thermal cycle labeling of DNA as well as the method of universal thermal 
cycle labeling. Example 13 provides a method for generation of quasi-random 
DNA fragments using CviJI. Example 14 describes fractionation of CviJI digested 
DNA by size using spin column chromatography. Example 15 details the relative 
cloning efficiency of CviJI digested, size-fractionated DNA by gel elution and 
chromatographic methods. Example 16 describes the comparison of cloning 
efficiency using lambda DNA fragmented by both sonication and CviJI** 
techniques. Example 17 details the use of CviJI** fragmentation for shotgun 
cloning and sequencing. Example 18 describes the shotgun cloning of lambda 
DNA using CviJI. Example 19 describes the use of CviJI in epitope mapping 
techniques. Example 20 describes the restriction endonuclease reagent CGase I. 

Example 1 
Propagation of IL-3A Virus 

The exsymbiotic Chlorella-like alga, NC64A, originally isolated 
from Paramecium bursaria (Karakashian, S.J. and Karakashian, M.W., Evolution 
and Symbiosis in the Genus Chlorella and Related Algae. Evolution 19:368-377 
(1965)), was grown and maintained in Bold's basal medium (BBM), (Nichols, 
H.W. and Bold, H.C. J. Phycol. 1:34-38 (1965)) modified by the addition of 
0.5% sucrose, 0.1% proteose peptone, and 20 µg/ml tetracycline (MBBM). 
Cultures were inoculated with 1 x 10^ algae cells/ml and grown at 25°C in 250 
ml of MBBM in 500 ml Erlenmeyer flasks on a rotary shaker (150 rpm) in 
continuous light (ca. 30 µEi m^-2 sec^-1). Growth was monitored by light 
scattering measured as A540nm or by direct cell counts with a 
hemocytometer. 

When the cultures reached approximately 1 x 10^ algae cells/ml 
they were inoculated with filter sterilized (0.4 µm nitrocellulose filter, 
Nucleopore, Pleasanton, California) IL-3A virus at a multiplicity of infection of 
0.01 and incubated for an additional 48 - 72 hours at 25°C. The crude lysate was 
then centrifuged at 3000 rpm (2000 x g) for 10 minutes to remove cellular debris. 
Nonidet P-40 was then added to 1% (v/v) and the virus was pelleted from the 
supernatant by centrifuging at 15,000 rpm at 4°C for 75 minutes in a Beckman 
No. 30 rotor. The viral pellet was gently resuspended in 0.05 M Tris-HCl, pH 
7.8, and the sample was layered on linear 10 - 40% sucrose gradients equilibrated 
with 0.05 M Tris-HCl, pH 7.8, and centrifuged for 20 minutes at 20,000 rpm at 
4°C in a Beckman SW28 rotor. The viral band, which was present in the center 
of the gradient as an opaque band, was removed, diluted with 0.05 M Tris-HCl, 
pH 7.8, and pelleted by centrifugation at 15,000 rpm at 4°C for 120 minutes in 
a Beckman No. 80 rotor. The virus was resuspended in a small volume (10 ml) 
of 0.05 M Tris-HCl, pH 7.8, and stored at 4°C. 

IL-3A viral DNA was purified from the viral particles using a 
modification of the protocol described by Miller, S.A., Dykes, D.D., and 
Polesky, H.F., Nucleic Acids Res. 16:1215 (1988). Briefly, 100 µl of IL-3A 
virus (9.8 x 10^ plaque forming units/ml) was diluted with 400 µl of water and 
then mixed with 10 µl TEN (0.5 M Tris-HCl, pH 9.0, 20 mM EDTA, 10 mM 
NaCl) and 10 µl of 10% SDS. After incubating at 70°C for 30 minutes the 
solution was extracted twice with phenol-chloroform-isoamyl alcohol, extracted 
once with chloroform, and precipitated with ice-cold ethanol using methods well 
known in the art (Ausubel, F.M., Brent, R., Kingston, R.E., Moore, D.D., 
Seidman, J.G., Smith, J.A. and Struhl, K. (Eds.) (1987) Current Protocols in 
Molecular Biology, Wiley, New York; Sambrook, J., Fritsch, E.F. and Maniatis, 
T. (1989), Molecular Cloning: A Laboratory Manual, Cold Spring Harbor 
Laboratory Press, Cold Spring Harbor, New York), and resuspended in 500 µl 
of H2O. 

Example 2 
CviJI Methyltransferase Clone 

The CviJI methyltransferase gene (M.CviJI) from Chlorella virus 
IL-3A was cloned and sequenced by Shields et al., Virology 176:16-24 (1990). 
Briefly, a Sau3A partial digest of Chlorella virus IL-3A was ligated to BamHI 
digested pUC19 and transformed into E. coli strain RR1. This library of plasmids 
was restricted with HindIII (AAGCTT) and SstI (GAGCTC), both of which are 
inhibited by 5-methylcytidine (5mC) in the AGCT portion of their recognition 
sequences, and transformed again into RR1 cells. M.CviJI methylates the internal 
cytidine in (G/A)GC(T/C/G) sequences. If the M.CviJI gene is cloned and 
expressed appropriately, the plasmid DNA would be expected to be resistant to 
HindIII and SstI restriction. 
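
The selection logic above rests on the HindIII and SstI recognition sequences each containing a (G/A)GC(T/C/G) target for M.CviJI. The following sketch checks this with a regular expression. It examines only the strand as written (both sites are palindromic), the EcoRI site GAATTC is included purely as a negative control, and the helper names are hypothetical.

```python
import re

# M.CviJI target as stated in the text: (G/A)GC(T/C/G), methylating the internal C.
M_CVIJI_TARGET = re.compile(r"(?=[GA]GC[TCG])")

RECOGNITION_SITES = {
    "HindIII": "AAGCTT",
    "SstI": "GAGCTC",
    "EcoRI": "GAATTC",  # negative control: contains no GC dinucleotide
}


def methylated_cytosine_positions(site: str):
    """Return 0-based positions of the C that M.CviJI would methylate."""
    return [m.start() + 2 for m in M_CVIJI_TARGET.finditer(site.upper())]


for enzyme, site in RECOGNITION_SITES.items():
    positions = methylated_cytosine_positions(site)
    status = "expected to be blocked" if positions else "not affected"
    print(enzyme, site, positions, status)
```

Both selection enzymes carry the target within the shared AGCT core, so a plasmid expressing active methylase should survive the digestion step.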

The CviJI methyltransferase gene was originally cloned as a 7.2 kb 
insert, termed pIL-3A.22. Plasmid pIL-3A.22 was only partially resistant to CviJI 
digestion. Partial digestion is most likely due to the inefficient expression of the 
M.CviJI gene and the numerous CviJI sites in both the vector (pUC19 has 45 
CviJI sites) and in the insert DNA. The M.CviJI gene was eventually sublocalized 
to a region of 3.7 kb by subcloning using methods well known in the art 
(Ausubel, F.M., Brent, R., Kingston, R.E., Moore, D.D., Seidman, J.G., Smith, 
J.A. and Struhl, K. (Eds.) (1987) Current Protocols in Molecular Biology, Wiley, 
New York; Sambrook, J., Fritsch, E.F. and Maniatis, T. (1989), Molecular 
Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold 
Spring Harbor, New York) and testing the subcloned DNA for 
sensitivity/resistance to HindIII, SstI, and CviJI (Shields et al., supra). The 
entire sequence was determined and three open reading frames which could code 
for polypeptides of 161, 367, and 162 amino acids, respectively, were identified. 
The 367 amino acid open reading frame (ORF) was identified as the M.CviJI gene 
by three criteria: (i) it is the only ORF located in the region identified by 
transposon mutagenesis; (ii) it has amino acid motifs similar to those of other 
cytosine methyltransferases; and (iii) a 1.6 kb DraI fragment containing the 367 
amino acid ORF (1101 bp) produces the methyltransferase. This 1.6 kb M.CviJI 
encoding fragment was subcloned into the EcoRV site of pBluescript KS(-) 
(Stratagene, La Jolla, CA), in the same translational orientation as the lacZ' gene 
of this vector. A physical map of the resulting plasmid termed p710 is shown in 
Figure 1. 

The plasmid p710 was digested with several endonucleases to 
indirectly test the efficiency of M.CviJI expression. Fully active methylase should 
render the plasmid DNA completely resistant to digestion by the following 
enzymes: HaeIII (which recognizes the sequence GGCC), SacI (which recognizes 
the sequence GAGCTC), and HindIII (which recognizes the sequence AAGCTT). 
The plasmid was partially resistant to HaeIII (90%) and SacI (^%), and even less 
resistant to HindIII (25%) digestion. This lack of complete protection of the 
plasmid DNA made it impractical to attempt cloning the three/two base restriction 
endonuclease encoded by the R.CviJI gene. Thus, improvements in the efficiency 
of M.CviJI expression were required before attempting to clone the R.CviJI gene. 

The translation efficiency of the M.CviJI gene was improved by 
removing extraneous 5' open reading frames, creating a perfect fusion of the 
lacZ' Shine-Dalgarno sequence with the methyltransferase start codon (see Figure 
1). This was achieved by site-specific oligonucleotide mutagenesis, using the 
oligomer 

5'-CAATTTCACACAGGAAACAGCTATGTCTTTTCGCACGTTAGAAC-3' 
(SEQ ID NO: 1) to precisely remove the intervening lacZ' DNA. The relevant 
DNA sequences are indicated in Figure 1 (SEQ ID NO: 12). The mutagenesis was 
facilitated by converting the double stranded plasmid DNA of p710 to 
single-stranded DNA by co-infecting the E. coli host strain with the helper phage R408 
(Russel, M., Kidd, S. and Kelly, M.R. Gene 45:333-338), using methods well 
known in the art. The mutagenesis reaction was completed using a commercially 
available kit according to the manufacturer's instruction (Mutagene, Bio-Rad, 
Hercules, California). The oligonucleotide was annealed to the single-stranded 
plasmid, extended in the presence of T4 DNA polymerase, ligated using T4 DNA 
ligase, and transformed into competent SURE™ cells (Stratagene, La Jolla, 
California). Transformed cells were then grown overnight as a pool, and the DNA 
was isolated and purified. 

Enrichment for the mutagenized plasmids was made possible by 
virtue of the loss of an XhoI site located in the sequence that was deleted by 
mutagenesis. Enrichment was accomplished by digesting the isolated, purified 
plasmid DNA with XhoI, followed by dephosphorylation with calf intestinal 
alkaline phosphatase (CIAP) and transformation into SURE cells. Plasmid DNA 
was isolated from 18 individual colonies and the DNA tested for resistance to 
XhoI. Plasmid DNA from 11 colonies was resistant to XhoI digestion, indicating 
that they lacked the deleted sequence. Five of these plasmids were restricted with 
HaeIII, HindIII, PvuII (which recognizes the sequence CAGCTG), and CviJI. All 
five appeared 100% resistant to these enzymes. Four of the plasmids were 
sequenced and the deletion was confirmed as being correct. One of these, 
pBMC5, was chosen for further modification. 

Example 3
Forced Co-Cloning of R.CviJI

The location of the R.CviJI gene on the IL-3A virus genome was
inferred as being 3' to the M.CviJI gene for two reasons: 1) the cloned DNA
sequence 5' to the M.CviJI gene did not produce a restriction activity; and 2)
several attempts to clone the DNA 3' to the M.CviJI gene resulted in
deletions/rearrangements of this downstream region. This information permitted
a forced co-cloning strategy to obtain the restriction endonuclease gene. This
strategy uses a deletion derivative of pBMC5 lacking the 3' half of the M.CviJI
gene. Digestion of the IL-3A genome with the same enzyme used to create the
M.CviJI deletion, followed by ligation of the respective DNAs, transformation,
and digestion with enzymes incapable of recognizing methylated DNA (e.g.,
HaeIII, HindIII, PvuII, CviJI, etc.) should force the selection of clones which
have a restored M.CviJI gene (and thus active methylase enzyme), as well as
downstream DNA. Thus, if a clone is found to be CviJI resistant, the 3' half of
M.CviJI must have been restored, and downstream DNA containing the R.CviJI
gene, at least in part, would presumably be cloned.

The details of this cloning strategy are as follows. pBMC5 has two
EcoRI sites, one approximately in the middle of the M.CviJI gene, while the other
site lies in the vector DNA, 3' to the M.CviJI gene (see Figure 1). pBMC5 was
restricted with EcoRI and ligated at a dilute concentration (10-50 ng/µl) to favor
circularization without the 3' M.CviJI fragment. The reaction mixture was then
transformed into competent SURE cells and plated on TY agar containing
ampicillin. Plasmid DNA from the resulting colonies was tested for the lack of
this EcoRI fragment by digestion with EcoRI. One of these clones, pBMC5RI,
was used for the subsequent co-cloning work. Plasmid pBMC5RI was digested
with EcoRI and dephosphorylated using CIAP. IL-3A genomic DNA was then
digested to completion with EcoRI. The EcoRI digested pBMC5RI and IL-3A
DNAs were combined at a ratio of 1:3 in a ligation reaction using T4 DNA
ligase, and the products of the ligation reaction were subsequently used to
transform competent SURE cells. The pBMC5RI/IL-3A transformants were not
plated, but rather grown overnight in culture as a library or pool of cells. The
cells were harvested the next day and DNA was isolated and purified. Isolated,
purified DNA was digested with HaeIII, dephosphorylated with CIAP, and
transformed into competent SURE cells. The cells were then plated and grown
overnight. Six colonies grew, of which only one, containing the plasmid
pCJH1.4, was resistant to HaeIII. The plasmid pCJH1.4 was found to encode
CviJI restriction activity. Plasmid pCJH1.4 was further characterized to localize
the gene for CviJI by deletion analysis, subcloning experiments, and sequencing.
The plasmid pCJH1.4 was deposited with the American Type Culture Collection
on June 30, 1993 under Accession Number 69341.



wo 94/21663 PCT/US94/03246 



Example 4

Sequencing of Cloned IL-3A DNA Containing CviJI Gene

The EcoRI fragment cloned into pCJH1.4 (as described in Example
3) is 4901 bp in length. Except for the 519 bp corresponding to the 3' portion
of the M.CviJI gene, the remainder of the 4901 bp EcoRI fragment cloned into
pCJH1.4 was sequenced using the SEQUAL DNA Sequencing System
(CHIMERx, Madison, WI) by methods well known in the art. Sequencing was
accomplished using three approaches: 1) primer walking on pCJH1.4; 2) cloning
various restriction endonuclease digests of pCJH1.4 into an M13 type sequencing
vector; and 3) sequencing various restriction endonuclease deletion derivatives of
pCJH1.4. The nucleotide sequence of 5497 bp of IL-3A viral DNA is shown in
Figure 2 and set forth in SEQ ID NO: 2.

Six open reading frames (ORF) of 1155 bp (ORF1), 468 bp
(ORF2), 555 bp (ORF3), 1086 bp (ORF4), 397 bp (ORF5) and 580 bp (ORF6)
which could code for polypeptides containing 358 (41.4 kD), 156 (19.4 kD), 185
(20.3 kD), 362 (38.9 kD), 132 (14.5 kD) and 193 (21.9 kD) amino acids,
respectively, were identified (see Figure 3). ORFs 4-6 do not code for the
R.CviJI gene, as the deletion derivative pCdA12, which lacks the DNA between
the AvaI and BamHI sites (see Figure 3), does produce CviJI restriction
endonuclease activity. In addition, the deletion derivative pCdEB7, lacking the
DNA between the EcoRI and BamHI sites, did not produce CviJI activity. Thus
ORF1 or ORF3 were the most likely candidates for encoding the R.CviJI gene.
The sequence of the 1155 bp ORF1 (SEQ ID NO: 3), its deduced amino acid
sequence (SEQ ID NO: 4) (as shown in capital letters), plus flanking bases, is
presented in Figure 4. The vertical line in Figure 4 and the associated arrow
indicate where the DNA sequence from pCJH1.4 diverges from that of
pIL-3A.22-8 (Shields, S.L., et al., Virology 76:16-24, 1990). This open reading
frame (ORF1) is believed to represent the CviJI gene because 14 out of 15
N-terminal amino acids from the protein sequence (see Example 6) matched the
predicted translation product of the nucleic acid sequence (Figure 4). Also, the
32.5 kD molecular weight of the homogeneously purified enzyme described in
Example 5 matched the predicted translation product of the nucleic acid sequence
(31.6 kD) if the encoded protein was translated beginning at the GTG codon
located at nucleotides 299-301 (Figure 4), instead of the 5' ATG codon located
at nucleotides 59-61. This possibility is not surprising in light of the fact that
approximately 10% of prokaryotic and eukaryotic gene products begin translation
with a GTG start codon, rather than the usual ATG codon (Kozak, M., Microbiol.
Rev. 47:1-45 (1983); Kozak, M., J. Cell Biol. 108:229 (1989); Gold, L., et al.,
Annu. Rev. Microbiol. 35:365-403 (1981)). The structural gene was identified to
be 834 nucleotides in length, coding for a protein of 278 amino acids (31.6 kD)
and is set forth in SEQ ID NO: 4. It is also interesting to note that the CviJI gene
was shown to possess an in-frame, upstream ATG codon which if translated could
yield a protein with a predicted molecular weight of 41.4 kD (Figure 4). A larger
molecular weight form possessing CviJI restriction activity has not been detected
by SDS gel electrophoresis. However, a second peak of CviJI activity which
eluted separately from the 32.5 kD form was detected in the initial stages of
enzyme purification. The DNA sequence which could theoretically code for a
larger form of CviJI would be approximately 1074 nucleotides in length (assuming
it starts at the upstream ATG codon) and would code for a protein of 358 amino
acids.
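
The relationship between the two candidate start codons can be checked with a
few lines of arithmetic. The following is a minimal sketch, not part of the
original disclosure; it uses only the Figure 4 coordinates and residue counts
quoted above, and the function and variable names are hypothetical.

```python
# Coordinates are the Figure 4 nucleotide positions quoted in the text.
ATG_START = 59        # first base of the upstream, in-frame ATG codon
GTG_START = 299       # first base of the GTG codon
SHORT_FORM_AA = 278   # residues in the protein translated from the GTG codon


def codons_between(upstream_start, downstream_start):
    """Number of whole codons separating two start codons in the same frame."""
    span = downstream_start - upstream_start
    if span % 3 != 0:
        raise ValueError("start codons are not in the same reading frame")
    return span // 3


extra_codons = codons_between(ATG_START, GTG_START)
long_form_aa = SHORT_FORM_AA + extra_codons

print("codons added by the upstream ATG:", extra_codons)
print("short form:", SHORT_FORM_AA, "aa,", SHORT_FORM_AA * 3, "nt")
print("long form:", long_form_aa, "aa,", long_form_aa * 3, "nt")
```

Under these figures the two start codons are separated by a whole number of
codons, which is consistent with the description of the upstream ATG as
in-frame, and the lengths agree with the 834 and 1074 nucleotide values given
above.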

Example 5

Purification of Recombinant CviJI Restriction Endonuclease

Initially, 20 ml of LB medium (plus 100 µg/ml ampicillin) were
inoculated with a 1 ml stock of E. coli transformed with the plasmid pCJH1.4
described above and grown overnight at 37°C with shaking. The next day, 20 ml
of this initial overnight culture was used to inoculate another 1 liter of LB
medium and grown overnight. The following day, 50 liters of TB medium (12
g Bacto-Tryptone, 24 g Bacto Yeast Extract, 4 ml glycerol, 2.31 g KH2PO4,
12.54 g K2HPO4, 0.1 g MgSO4, 100 µg/ml ampicillin, and water to 1 liter) were
inoculated with an aliquot of the secondary overnight culture and grown at 37°C
with 20 liters/min aeration at 200 RPM, until the OD595nm reached 1.0 unit.
Vigorous aeration was essential for CviJI expression and a typical yield contained
70 g of cell paste after centrifugation.

The cell pellet was immediately resuspended in lysis buffer A
(30 mM Tris-HCl, pH 7.9 at 4°C, 2 mM EDTA, 10 mM beta-mercaptoethanol,
50 µg/ml phenylmethylsulfonyl fluoride (PMSF), 20 µg/ml benzamidine, 2 µg/ml
o-phenanthroline, 0.7 µg/ml pepstatin) at a volume of 3 ml of buffer A per 1 g of
cells. The cell suspension was then passed through a Manton-Gaulin cell
disrupter (Gaulin Corporation, Everett, MA) twice and centrifuged for 1 hr (8000
RPM, Sorvall GS3 Rotor) at 4°C. To the supernatant, solid NaCl was added to
a final concentration of 200 mM, and 10% polyethyleneimine (PEI) solution was
slowly added to a final concentration of 1%. The mixture was stirred for 3 hr,
and then centrifuged for 30 min at 4°C, 8000 RPM (Sorvall GS3 Rotor). Solid
ammonium sulfate was then added to the supernatant at 0.5 g/ml and the mixture
was stirred overnight at 4°C. The precipitated proteins were centrifuged for 1 hr
(8000 RPM, Sorvall GS3 Rotor) at 4°C and the resulting pellet dissolved in
100 ml of buffer B (10 mM K/PO4, pH 7.2, 0.5 mM EDTA, 10 mM
beta-mercaptoethanol, 50 mM NaCl, 10% glycerol, 0.05% Triton X-100, 50 µg/ml
PMSF, 20 µg/ml benzamidine, 2 µg/ml o-phenanthroline, 0.7 µg/ml pepstatin).
The dissolved protein solution was then dialyzed (14 kD cut-off) for 12 hours
against three 1 liter changes of buffer B. The dialyzed solution was then diluted
to 600 ml with buffer B and applied to a 5 x 20 cm phosphocellulose P11
(Whatman) column (flow rate 100 ml/hr).

The column was then washed with 1.5 liters of buffer B followed
by a 0 - 1.5 M NaCl gradient in buffer B (5 liters). R.CviJI eluted at
approximately 600 mM NaCl. The active fractions were then pooled and
concentrated to 50 ml with a 76 mm Amicon YM10 membrane. The resulting
solution was then diluted to 300 ml with buffer C (20 mM Tris-acetate, pH 7.4
at 4°C, 2 mM EDTA, 10 mM beta-mercaptoethanol, 50 mM NaCl, 10%
glycerol, 0.01% Triton X-100, 50 µg/ml PMSF, 20 µg/ml benzamidine, 2 µg/ml
o-phenanthroline, 0.7 µg/ml pepstatin) and applied to a 2.5 x 7 cm
Heparin-Sepharose column at a flow rate of 25 ml/hr.

After a 400 ml wash with buffer B, R.CviJI was eluted with a
1.5 liter gradient of 0 - 1.3 M NaCl in buffer C. CviJI eluted at approximately
400 mM NaCl. The most active fractions were pooled and applied to a
2.5 x 7 cm Blue-agarose column equilibrated in buffer D (20 mM Tris-acetate pH
8.0, 1 mM EDTA, 7 mM beta-mercaptoethanol, 30 mM NaCl, 10% glycerol,
0.01% Triton X-100, 50 µg/ml PMSF, 20 µg/ml benzamidine, 2 µg/ml
o-phenanthroline, 0.7 µg/ml pepstatin). After a 500 ml wash with buffer D, CviJI
was eluted with a 0 - 1.5 M NaCl gradient (1.5 l) in buffer D. Active fractions
were dialyzed against buffer G (10 mM K/PO4 pH 7.0 (4°C), 10 mM
beta-mercaptoethanol, 50 mM NaCl, 10% glycerol, 0.01% Triton X-100, 50 µg/ml
PMSF, 20 µg/ml benzamidine, 2 µg/ml o-phenanthroline, 0.7 µg/ml pepstatin)
and loaded (20 ml/h) onto a ceramic HTP column (American International
Chemical, Natick MA) (1.5 x 3 cm), equilibrated in buffer F (20 mM Tris-HCl
pH 8.0, 0.5 mM EDTA, 3 mM DTT, 50 mM K-acetate, 5 mM Mg acetate, 50%
glycerol). After washing with 100 ml of buffer F, a 400 ml gradient 0 - 0.9 M
K/PO4 in buffer F was run. The HTP column was washed with buffer G,
containing 3 mg/ml BSA, then with 1 M phosphate buffer and reequilibrated in
buffer G. The active fractions were then pooled and concentrated using a YM10
membrane to a final volume of 3 - 4 ml. This concentrate was then applied to a
2.5 x 95 cm Sephadex G-100 column, equilibrated in buffer E (20 mM Tris-HCl
pH 7.5 (4°C), 5 mM Mg-Acetate, 2 mM EDTA, 10 mM beta-mercaptoethanol,
100 mM NaCl, 5% glycerol, 0.01% Triton X-100, 50 µg/ml PMSF, 20 µg/ml
benzamidine, 2 µg/ml o-phenanthroline, 0.7 µg/ml pepstatin) at a flow rate of
6 ml/hr, and 3 ml fractions were collected. Active fractions were dialyzed against
storage buffer F.

The molecular weight of the purified CviJI was determined by
comparison to known protein standards on a denaturing 10% SDS polyacrylamide
gel. A single band migrating with an apparent molecular weight of 32.5
kilodaltons was seen, indicating that by these criteria CviJI was purified to
homogeneity.

Example 6

N-Terminal Amino Acid Sequence of R.CviJI

To confirm that the restriction endonuclease encoded by the insert
in pCJH1.4 was CviJI, the sequence of the first 15 N-terminal amino acids of
purified CviJI was determined by the Edman degradation method using an Applied
Biosystems (Foster City, CA) 477A Liquid Phase Protein Sequencer with an
on-line 120A PTH Analyzer. The results of that analysis are shown in Table 1.

Table 1

N-Terminal Amino Acid Analysis of CviJI

Amino     Retention
Acid #    Time (min)    pmol    pmol    pmol    pmol ratio   Amino Acid ID

  1          9.17       6.11    3.86    5.10      34.53      THR, MET, ARG, or LYS
  2         10.32       3.92    1.54    1.82       9.96      GLU
  3         10.33       4.28    2.22    2.18      11.96      GLU
  4         27.37       2.23    1.49    1.72       7.64      LYS
  5         27.35       2.37    1.66    1.67       7.39      LYS
  6         17.95       3.37    2.76    2.81       9.48      ARG
  7         28.10       3.19    1.73    2.08       6.09      LEU
  8         13.58       3.58    2.11    2.49      12.08      ALA
  9         28.10       3.23    1.68    1.58       4.63      LEU
 10         18.17       0.71    0.78    0.36       1.21      ILE
 11         10.30       1.65    0.78    0.96       5.26      GLU
 12          9.72       8.03    0.41    1.31       3.25      LYS
 13          8.53       1.54    0.53    0.55       2.97      GLN
 14         18.18       2.19    1.74    1.67       5.63      ARG
 15         26.80       3.33    0.43                0.89      ILE

Abbreviations used: threonine (THR), methionine (MET), arginine (ARG), lysine
(LYS), glutamic acid (GLU), leucine (LEU), alanine (ALA), isoleucine (ILE) and
glutamine (GLN).



The results of this analysis confirm that the protein encoded by the
DNA insert in pCJH1.4 (ORF1) is CviJI.

The following Examples illustrate some of the unique properties of
and important uses for CviJI.

Example 7 

Analysis of CviJI* Recognition Sequences 

The CviJI* recognition sequence (see Xia, et al., Nuc. Acids Res.
15: 6025-6090, 1987) was deduced by cloning and sequencing CviJI* digested
pUC19 DNA fragments. A complete CviJI* digest of pUC19 was ligated to an
M13mp18 cloning derivative for nucleotide sequence analysis. The sequence of
the entire insert was read in order to determine which sites were or were not
utilized. A total of 100 clones were sequenced, resulting in 200 CviJI* restricted
junctions, the data for which are compiled in Table 2.

The dinucleotide GC is found at 205 sites in pUC19. These GC
sites (shown in Table 2) can be divided into four classes based on their flanking
Pu/Py structure: the normal recognition sequence (N) and three potential classes
of relaxed sites (R1, R2 and R3). As seen in Table 2, the fraction of such NGCN
sites which belong to each classification is roughly equal (22.0%-27.8%). A total
of 200 CviJI* restricted junctions were analyzed by sequencing 100 cloned inserts.
If CviJI* cleaved at all NGCN sites without sequence preferences, it would be
expected that each classification would be restricted at approximately the same
frequency. Instead, most of the sites cleaved by this treatment were found to be
normal, or PuGCPy, sites (47.5%). R1 (PyGCPy) and R2 (PuGCPu) restricted
sites were found at nearly the same frequency (25.5% and 27.0%, respectively).
Out of 200 CviJI* junctions, no R3 (PyGCPu) restricted sites were found. Thus,
CviJI* cleaves all NGCN sites except for PyGCPu. As CviJI* cleaves 12 out of
16 possible NGCN sites, it may be referred to as a 2.25-base recognition
endonuclease.
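
The four site classes and the 12-of-16 count can be reproduced by enumerating
every NGCN tetranucleotide. The sketch below is illustrative only; the class
labels follow the text above, while the function and variable names are
hypothetical.

```python
from itertools import product

PURINES = set("AG")


def classify(site):
    """Classify an NGCN site by the purine/pyrimidine nature of its flanks."""
    left_pu = site[0] in PURINES
    right_pu = site[3] in PURINES
    if left_pu and not right_pu:
        return "N"    # PuGCPy, the normal recognition sequence
    if not left_pu and not right_pu:
        return "R1"   # PyGCPy
    if left_pu and right_pu:
        return "R2"   # PuGCPu
    return "R3"       # PyGCPu, not cleaved under CviJI* conditions


sites = [left + "GC" + right for left, right in product("ACGT", repeat=2)]

by_class = {}
for site in sites:
    by_class.setdefault(classify(site), []).append(site)

cleaved = [site for site in sites if classify(site) != "R3"]

for label in ("N", "R1", "R2", "R3"):
    print(label, by_class[label])
print(len(cleaved), "of", len(sites), "NGCN sites are cleaved")
```

Each class contains four of the sixteen possible sites, so excluding the R3
class leaves twelve cleavable sites.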

In addition to the restricted sites, those sites which were not cleaved
under CviJI* conditions were also compiled for analysis, as shown in Table 2. A
total of 116 non-cleaved NGCN sites were found in the 100 inserts which were
sequenced. PyGCPu sites represented the largest class of non-cleaved sites
(52.6%). In only two cases were PuGCPy sites found not to be cleaved. An
approximately equal fraction of R1 and R2 sites were not cleaved as were found
cleaved (22.4% versus 25.5% for R1 and 23.3% versus 27.0% for R2). Based
on the frequency of cleavage, or lack thereof, a hierarchy of restriction under
CviJI* conditions is evident, where PuGCPy >> PuGCPu = PyGCPy.

Example 8

CviJI* Restriction Generated Oligonucleotides

Due to the high frequency of CviJI or CviJI* restriction, it is
possible to generate useful oligonucleotides by digestion and a heat denaturation
step as described above. The size and number of the resulting oligonucleotides
are important for subsequent applications such as those described above. If, for
example, an oligonucleotide is to be used with a large genome, it has to be long
enough so that the sequence detected has a probability of occurring only once in
the genome. This minimum length has been calculated to be 17 nucleotides for
the human genome (Thomas, C.A., Jr., Prog. Nucl. Acid Res. Mol. Biol. 5:315
(1966)). Oligonucleotides used for sequencing or PCR amplification are generally
17-24 bases in length. Oligomers of shorter length will often bind at multiple
positions, even with small genomes, and thus will generate spurious extension
products. Thus, an enzymatic method for generating oligomers should ideally
result in polymers greater than 18 bases in length.
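
The minimum-length argument can be illustrated with a simple estimate: an
oligonucleotide of length n is expected to be unique once the number of
possible sequences, 4^n, exceeds the number of positions at which it could
anneal. The sketch below is an illustration under stated assumptions and is not
the calculation used in the cited reference: it assumes a random-sequence
genome, a haploid human genome of roughly 3 x 10^9 bp, and that both strands
are available for annealing. The function name is hypothetical.

```python
def min_unique_length(genome_bp, strands=2):
    """Smallest n for which 4**n exceeds the number of annealing positions.

    Assumes a random-sequence genome; `strands=2` counts both strands.
    """
    positions = genome_bp * strands
    n = 1
    while 4 ** n <= positions:
        n += 1
    return n


HUMAN_GENOME_BP = 3_000_000_000   # assumed approximate haploid size
PUC19_BP = 2686                   # size of the pUC19 plasmid

print("human genome:", min_unique_length(HUMAN_GENOME_BP), "nt")
print("pUC19:", min_unique_length(PUC19_BP), "nt")
```

Under these assumptions the estimate for the human genome agrees with the
17-nucleotide figure quoted above, while a plasmid-sized target requires far
shorter oligomers for uniqueness.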

The theoretical number of pUC19 CviJI* restriction-generated
oligomers is 314 (157 CviJI* restriction fragments x 2 oligomers/fragment), the
size distribution of which is shown in panel A of Figure 5. Most of the expected
CviJI* restriction-generated oligomers (about 75%) are smaller than 20 bp. This
assumes that CviJI is capable of restricting DNA to very small fragments, the
shortest of which would be 2 bp. However, in practice, about 93% of the cloned
CviJI* fragments were 20-56 bp in size, and 3% of the fragments generated by
CviJI* were smaller than 20 bp (panel B of Figure 5). This suggests that CviJI*
is not able to bind or restrict those fragments below a certain threshold length.
Since the smallest observed fragment is 18 bp, it may be assumed that this length
is the minimal size which can be generated from a given larger fragment.
Whatever the reason for this phenomenon, CviJI* treatment of DNA produces a
relatively small range of oligomers (mostly 20-60 bases in length), most of which
are a perfect size class for molecular biology applications.
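
A theoretical size distribution of the kind referred to above can be generated
by an in silico digest. The following is a minimal sketch, assuming a linear
molecule, blunt cleavage between the G and C of each recognized site, and no
lower limit on fragment size (so it models the theoretical case, not the
observed 18 bp threshold). The toy sequence and all function names are
hypothetical.

```python
import re


def cut_sites(seq, relaxed=False):
    """Return cut positions (index of the base following the cut).

    Normal activity cleaves PuGCPy only; relaxed (CviJI*) activity cleaves
    every NGCN site except PyGCPu. The lookahead finds overlapping sites.
    """
    cuts = []
    for match in re.finditer(r"(?=([ACGT])GC([ACGT]))", seq.upper()):
        left_pu = match.group(1) in "AG"
        right_pu = match.group(2) in "AG"
        is_normal = left_pu and not right_pu
        is_r3 = (not left_pu) and right_pu
        if is_normal or (relaxed and not is_r3):
            cuts.append(match.start() + 2)   # cut between G and C
    return cuts


def fragment_lengths(seq, relaxed=False):
    """Lengths of the fragments from a complete digest of a linear molecule."""
    bounds = [0] + cut_sites(seq, relaxed) + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Toy sequence containing AGCT (N), GGCC (N), CGCT (R1) and TGCA (R3) sites.
toy = "TTAGCTAAGGCCTTCGCTTTGCATT"
print("normal digest :", fragment_lengths(toy))
print("relaxed digest:", fragment_lengths(toy, relaxed=True))
```

Applying the same functions to a full plasmid sequence would give the
theoretical fragment sizes, which could then be compared against the observed
distribution.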

Example 9 
Anonymous Primer Cloning 

Primers are critical tools in many molecular biology applications
such as PCR, sequencing, and as probes. Anonymous primers are useful as
sequencing primers for genomic sequencing projects, as probes for mapping
chromosomes, or to generate oligonucleotides for PCR amplification.

The Anonymous Primer Cloning (APC) method is a variation of
shotgun cloning in that unknown sequences of DNA are being randomly cloned.
However, unlike CviJI shotgun cloning, wherein a partial CviJI** digest of DNA
is cloned, anonymous primer cloning utilizes a complete CviJI* digest to restrict
large DNAs into small fragments 20-200 bp in size. These small fragments are
cloned into a unique vector designed for excising the anonymous DNA as labeled
primers. The strategy for this method is illustrated in Figure 6.

As illustrated in Figure 6, the APC strategy reduces large DNAs
to small fragments, which are cloned and excised for use as primers. Plasmid
pFEM has a unique arrangement of the restriction sites for MboII and FokI, which
permits DNA cloned into the EcoRV site to be excised without associated vector
DNA. This is possible because FokI cleaves 9/13 bases to the left of the
recognition site shown in pFEM and MboII cleaves 8/7 bases to the right of the
recognition site shown in pFEM, which is well into the cloned anonymous
sequence. After MboII or FokI restriction, a known flanking primer is annealed
(primer 1 or 2) and extended using a DNA polymerase and dNTPs. The primer
is previously end-labeled, or alternatively, one or more of the dNTPs is
radioactive.

After denaturation of the newly synthesized DNA and separation
from its cognate template, the labeled anonymous primer is ready for use in
sequencing the original template from which it was subcloned. The presence of
the pFEM vector sequence fused to the anonymous sequence does not influence
the enzymatic extension of this primer from its unique binding site, as the vector
DNA is at the 5' end and the unique sequence is located at the 3' end (all
polymerases extend 5' to 3'). Both the top and bottom strand primers may be
excised from pFEM due to the symmetrical placement of restriction sites and
flanking primer binding sites. Thus, two primers may be derived from each
cloning event. APC is particularly well suited to the genomic sequencing strategy
of Church and Gilbert, Proc. Natl. Acad. Sci. USA 81:1991-1995 (1984), although
its utility is not limited thereto.

Example 10

End Labeling of Restriction-Generated Oligonucleotides

As is clear from the foregoing examples, digesting DNA with
CviJI provides the ability to generate sequence-specific oligonucleotides ranging
in size from 20-200 bases in length with an average length of 20-60 bases.
Sequence-specific oligonucleotides generated by CviJI* digestion may be labeled
directly at the 5'-end or at the 3'-end using techniques well known in the art.

For example, 5'-end labeling may be accomplished by either a
forward reaction or an exchange reaction using the enzyme T4 polynucleotide
kinase. In the forward reaction, 32P from [γ-32P]ATP is added to a 5' end of an
oligonucleotide which has been dephosphorylated with alkaline phosphatase using
standard techniques widely known in the art and described in detail in Sambrook
et al., Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring
Harbor Laboratory Press (1989). In an exchange reaction, an excess of ADP
(adenosine diphosphate) is used to drive an exchange of a 5'-terminal phosphate
from the sequence-specific oligonucleotide to ADP, which is followed by the
transfer of 32P from [γ-32P]ATP to the 5'-end of the oligonucleotide. This
reaction is also catalyzed by T4 polynucleotide kinase and is described in
Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold
Spring Harbor Laboratory Press (1989).

Homopolymeric tailing is another standard labeling technique useful
in the labeling of CviJI*-generated sequence-specific oligonucleotides. This
reaction involves the addition of 32P-labeled nucleotides to the 3'-end of the
sequence-specific oligonucleotides using a terminal deoxynucleotide transferase
(Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold
Spring Harbor Laboratory Press (1989)).

Commonly used labeling techniques typically employ a single
oligonucleotide directed to a single site on the target DNA and containing one or
a few labels. Oligonucleotides generated by the method of the present invention
are directed to many sites of a target DNA by virtue of the fact that they are
generated from a sample of the target sequence. Thus, the hybridization of
multiple oligonucleotides (labeled by the methods described above) allows a
significantly enhanced sensitivity in the detection of target sequences. In addition,
the short length of the labeled oligonucleotides used in the methods of the present
invention allows a reduction in hybridization time from overnight (as is used in
conventional methods) to 60 min.

Although labeling sequence-specific oligonucleotides with 32P is
described above, labeling with other radionucleotides and non-radioactive labels
is also within the scope of the present invention.

Example 11
Primer Extension Labeling of DNA Using
Restriction-Generated Oligonucleotides (PEL-RGO)

Another aspect of the present invention includes methods for
labeling DNA which include the generation of oligonucleotide primers by
complete digestion with CviJI*, followed by heat denaturation. PEL-RGO
requires three steps: 1) generating the sequence-specific oligonucleotides by CviJI
restriction of the template DNA; 2) denaturation of the template and primer; and
3) primer extension in the presence of labeled nucleotide triphosphates. Plasmid
DNA may be prepared by methods known in the art such as the alkaline lysis or
rapid boiling methods (Sambrook et al., Molecular Cloning: A Laboratory
Manual, 2nd Edition, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, New York (1989)). In addition, the vector should be linearized to ensure
effective denaturation. A restriction fragment may be labeled after separation on
low melting point agarose gels by methods well known in the art.

In PEL-RGO labeling, template DNA to be labeled is divided into
two aliquots; one is used to generate the sequence-specific oligonucleotide primers
and the other aliquot is saved for the primer annealing and extension reaction.
A typical reaction mix for generating sequence-specific oligonucleotides is
assembled in a microcentrifuge tube and includes: 100 ng DNA; 2 µl 5X CviJI*
buffer; 0.5 µl CviJI (1 u/µl); sterile distilled water to 10 µl final volume. CviJI*
5X restriction buffer includes: 100 mM glycylglycine (Sigma, St. Louis,
Missouri, Cat. No. G226S) pH adjusted to 8.5 with KOH, 50 mM magnesium
acetate (Amresco, Solon, Ohio, Cat. No. POO 13 119), 35 mM β-mercaptoethanol
(Mallinckrodt, Paris, Kentucky, Cat. No. 60-24-2), 5 mM ATP, 100 mM
dithiothreitol (Sigma, St. Louis, Missouri, Cat. No. D9779) and 25% v/v DMSO
(Mallinckrodt Cat. No. 67-68-5). CviJI is obtained from CHIMERx (Madison,
Wisconsin). The reaction mix is incubated at 37°C for 30 min, followed by the
inactivation of CviJI by heating at 65°C for 10 min. The CviJI*-restricted DNA
may be used directly without further purification, or it may be stored at -20°C for
several months for subsequent labeling reactions.

After heat-inactivating CviJI, 0.2 µg of the digested and undigested
DNA are electrophoresed on a 1.5% agarose gel, using a suitable molecular
weight marker for comparison. The CviJI restriction fragments appear as a low
molecular weight smear in the 20-200 bp range.
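
The working concentrations in the digestion reaction follow from diluting 2 µl
of the 5X buffer into a 10 µl reaction. The sketch below simply carries out
that dilution using the stock concentrations listed above; the dictionary and
function names are hypothetical and the sketch is illustrative only.

```python
# Composition of the 5X restriction buffer as listed in the text.
STOCK_5X = {
    "glycylglycine (mM)": 100.0,
    "magnesium acetate (mM)": 50.0,
    "beta-mercaptoethanol (mM)": 35.0,
    "ATP (mM)": 5.0,
    "dithiothreitol (mM)": 100.0,
    "DMSO (% v/v)": 25.0,
}


def final_concentrations(stock, stock_volume_ul, total_volume_ul):
    """Concentration of each component after dilution into the reaction."""
    return {
        name: value * stock_volume_ul / total_volume_ul
        for name, value in stock.items()
    }


working = final_concentrations(STOCK_5X, stock_volume_ul=2.0, total_volume_ul=10.0)
for name, value in working.items():
    print(f"{name}: {value:g}")
```

With 2 µl of stock in 10 µl the buffer is at 1X, so each component is present
at one fifth of its stock concentration.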

By way of example, 1-10 ng of linearized pUC19 was labeled under
the conditions described below. A template-primer cocktail was prepared by
mixing 10 ng of linearized pUC19 DNA template with 20 ng pUC19
sequence-specific oligonucleotides (prepared as described above) and the mixture
is brought to a final volume of 17 µl with sterile distilled water. The
template-primer mixture is denatured in a boiling water bath for 2 minutes and
immediately placed on ice.

The following labeling mixture is then added to the template-primer
mix: 2.5 µl 10X labeling buffer (500 mM Tris-HCl at pH 9.0, 30 mM MgCl2,
200 mM (NH4)2SO4, 20 mM dATP, 20 µM dTTP, 20 µM dGTP, 0.4% NP-40);
5.0 µl [α-32P]dCTP (3000 Ci/mmol, 10 µCi/µl; New England Nuclear, Catalog
No. NEG013H); 0.5 µl Thermus flavus DNA polymerase (5 u/µl) (Molecular
Biology Resources, Milwaukee, Wisconsin); up to 25 µl final volume with
distilled water. The reaction was incubated at 70°C for 30 min and then stopped
by adding 2 µl of 0.5 M EDTA at pH 8.0 to the reaction mix.
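
The volumes and the amount of label in this reaction can be tallied as a quick
consistency check. The following sketch uses only the quantities listed above;
the variable names are hypothetical.

```python
# Component volumes (in microliters) as listed in the text.
components_ul = {
    "template-primer mix": 17.0,
    "10X labeling buffer": 2.5,
    "[alpha-32P]dCTP": 5.0,
    "Thermus flavus DNA polymerase": 0.5,
}
FINAL_VOLUME_UL = 25.0
LABEL_UCI_PER_UL = 10.0   # stated concentration of the labeled dCTP

water_ul = FINAL_VOLUME_UL - sum(components_ul.values())
label_uci = components_ul["[alpha-32P]dCTP"] * LABEL_UCI_PER_UL
buffer_dilution = FINAL_VOLUME_UL / components_ul["10X labeling buffer"]

print("water needed (µl):", water_ul)
print("label added (µCi):", label_uci)
print("labeling buffer dilution (fold):", buffer_dilution)
```

With a 17 µl template-primer mix the listed components already total 25 µl, the
labeling buffer ends up at 1X, and the reaction contains 50 µCi of label.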

The efficiency of the labeling reaction is gauged by the percentage
of radioisotope incorporated into labeled DNA. One microliter of the labeling
reaction is added to 99 µl of 10 mM EDTA in a microcentrifuge tube. This serves
as the source of diluted probe for total and trichloroacetic acid (TCA)-precipitable
counts. 2 µl of diluted probe is spotted onto the center of a glass fiber filter disc
(Whatman number 934-AH). The disc is then allowed to dry and is then placed
in a vial containing scintillation cocktail for counting total radioactivity in a liquid
scintillation counter. Another 2 µl aliquot from the diluted probe is added to 1
ml of 10% ice cold TCA followed by the addition of 2 µl of carrier bovine serum
albumin (BSA). This mixture is then placed on ice for 10 minutes. The
precipitate is then collected on a glass filter disc (Whatman No. 934-AH) by
vacuum filtration. The filter is then washed with 20 ml of ice cold 10% TCA,
allowed to dry and is placed in a vial containing scintillation cocktail and counted.

Because primer extension oligonucleotide labeling results in net
DNA synthesis, the specific activity of labeled DNA is calculated using the
following guidelines.

Total cpm incorporated = TCA cpm x 50 x 27

Wherein the factor 50 is derived from using 2 µl of a 1:100 dilution for TCA
precipitation. The number 27 converts this back to the total reaction volume
(which is the reaction volume plus 2 µl of stop solution).

Synthesized DNA (ng of DNA synthesized) =
    theoretical yield x fraction of radioactivity incorporated

Theoretical yield (ng of DNA) =
    (µCi dNTPs added x 4 x 330 ng/nmole) /
    (specific activity of dNTP, in Ci/mmole = µCi/nmole)

Fraction of incorporated label = TCA precipitated cpm / total cpm

Specific activity (cpm/µg of DNA) =
    (total cpm incorporated x 1000) / (synthesized DNA + input DNA)

Wherein 1000 is the factor converting nanograms to micrograms.

By way of example, the following represents the calculation of
specific activity for an aliquot of pUC19 DNA labeled using this method, using
50 µCi of [α-32P]dCTP in a 25 µl reaction, and if the TCA precipitated cpm is
26192 and total cpm is 102047:

Total cpm incorporated = 26192 x 50 x 27 = 3.27 x 10^7 cpm

Synthesized DNA (ng of DNA synthesized) =
    theoretical yield x fraction of radioactivity incorporated

Theoretical yield = (µCi of dNTPs x 4 x 330) / 3000 µCi/nmole
                  = (50 x 4 x 330) / 3000
                  = 22 ng

Fraction of label incorporated = TCA precipitated cpm / total cpm
                               = 26192 / 102047
                               = 0.256

Synthesized DNA = 22 x 0.256
                = 5.6 ng

Specific activity (cpm/µg) =
    (total cpm incorporated x 1000) / (synthesized DNA + input DNA)

Input DNA = 10 ng

Specific activity = (3.27 x 10^7 x 1000) / (5.6 + 10)
                  = 2.09 x 10^9 cpm/µg
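
The calculation above can be expressed as a short function. This is a minimal
sketch of the stated formulas, not a validated laboratory tool; the function
and parameter names are hypothetical. The volume factor is left as a parameter
because 26192 x 50 x 27 evaluates to about 3.54 x 10^7 cpm, whereas the quoted
total of 3.27 x 10^7 cpm corresponds to a factor of 25.

```python
def specific_activity(tca_cpm, total_cpm, label_uci, label_sa_ci_per_mmol,
                      input_dna_ng, dilution_factor=50, volume_factor=27):
    """Apply the specific-activity formulas given in the text.

    dilution_factor: 50, from counting 2 µl of a 1:100 dilution.
    volume_factor: 27, the reaction volume plus stop solution in µl.
    """
    total_incorporated = tca_cpm * dilution_factor * volume_factor
    # 4 dNTPs at an average of 330 ng/nmole; Ci/mmole equals µCi/nmole.
    theoretical_yield_ng = label_uci * 4 * 330 / label_sa_ci_per_mmol
    fraction = tca_cpm / total_cpm
    synthesized_ng = theoretical_yield_ng * fraction
    cpm_per_ug = total_incorporated * 1000 / (synthesized_ng + input_dna_ng)
    return {
        "total_cpm_incorporated": total_incorporated,
        "theoretical_yield_ng": theoretical_yield_ng,
        "fraction_incorporated": fraction,
        "synthesized_ng": synthesized_ng,
        "specific_activity_cpm_per_ug": cpm_per_ug,
    }


result = specific_activity(tca_cpm=26192, total_cpm=102047, label_uci=50,
                           label_sa_ci_per_mmol=3000, input_dna_ng=10)
for key, value in result.items():
    print(f"{key}: {value:.4g}")
```

Passing `volume_factor=25` gives a total close to the 3.27 x 10^7 cpm quoted in
the worked example.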



Unincorporated radioactive label may be removed using standard
methods well known in the art.

Comparisons were made between PEL-RGO and RPL under similar
conditions, and it was observed that a detection limit of 100 fg was seen using
PEL-RGO labeled DNA compared to a detection limit of 500 fg with RPL, using
a radiolabeled probe.

Example 12

Thermal Cycle Labeling and Universal Thermal Cycle Labeling

Thermal Cycle Labeling (TCL) is a method according to the present
invention for efficiently labeling double-stranded DNA while simultaneously
amplifying large amounts of the labeled probe. TCL of DNA requires two
general steps: 1) generation of the sequence-specific oligonucleotides by CviJI*
restriction of the template DNA; and 2) repeated cycles of denaturation,
annealing, and extension in the presence of a thermostable DNA polymerase or
a functional fragment thereof which maintains polymerase activity. Optimal
results are obtained after 20 such cycles, which are best performed in an automated
thermal cycling instrument such as a Perkin-Elmer Model 480 thermocycler. In
conjunction with such an instrument, about 1.5 hr is required to complete this
protocol. If a thermal cycler is not available these reactions may be performed
using heat blocks. As few as 5 cycles may yield probes with acceptable detection
sensitivities. The generation of sequence-specific oligonucleotides for use in this
method may also be accomplished using the restriction endonuclease reagent
CGase I described in Example 20 or the restriction endonuclease AciI, which has
as a recognition sequence CCGC.

Non-radioactive labeling of DNA using TCL is accomplished by
mixing: 10 pg - 100 ng linearized template, 50 ng CviJI*-digested primers
(prepared as described above), 1.5 µl 10X labeling buffer, 0.5 µl Thermus flavus
DNA polymerase (5 u/µl) (Molecular Biology Resources, Inc., Milwaukee,
Wisconsin), 1 µl of 1 mM Biotin-11-dUTP (Enzo Diagnostics, New York, New
York), 1.5 µl each of dATP, dCTP, and dGTP (2 mM), and 1.0 µl 2 mM dTTP.

Radioactive labeling of DNA using TCL was accomplished by
mixing 10 pg - 100 ng of CviJI-generated primers, 10 pg - 25 ng of linearized
template, 1.5 µl of 10X labeling buffer, 5 µl of 32P-dCTP (3000 Ci/mmole, 10
µCi/µl or 40 µCi/µl), 0.5 µl of Thermus flavus DNA polymerase (5 u/µl), and 0.5
µl each of dATP, dGTP, and dTTP (1 mM). The reaction mix was
brought to a volume of 15 µl with deionized H2O, overlaid with mineral oil and
cycled through 20 rounds of denaturation, annealing and extension. A typical
cycling regimen employed 20 cycles of denaturation at 91°C for 5 sec, annealing
at 50°C for 5 sec and extension at 72°C for 30 sec. The reaction is then
terminated by adding 1 µl of 0.5 M EDTA, pH 8.0. The amplified, labeled probe
is a very heterogeneous mixture of fragments, which appears as a smear when
analyzed by agarose gel electrophoresis.

Universal thermal cycle labeling (UTCL) is a method according to
the present invention for efficiently labeling double-stranded DNA while
simultaneously amplifying large amounts of labeled probe. UTCL is unique in that
no sequence information is required regarding the template. The extension
primers are supplied endogenously via the holo-enzyme of the thermostable DNA
polymerase and any anonymous DNA template can be labeled by repeated cycles
of denaturation, annealing, and extension in the presence of a labeled
deoxynucleotide triphosphate. Optimal results are obtained after 20 such cycles,
which are best performed in an automated thermal cycling instrument such as a
Perkin-Elmer Model 480 thermocycler. In conjunction with such an instrument,
about 1.5 hr is required to complete this protocol. If a thermal cycler is not
available, these reactions may be performed using heat blocks. As few as 5
cycles may yield probes with acceptable detection sensitivities.

Non-radioactive labeling of DNA using UTCL is accomplished by
mixing: 10 ng linearized template, 1.5 µl 10X labeling buffer, 0.5 µl Thermus
flavus DNA polymerase (5 u/µl) (Molecular Biology Resources, Inc., Milwaukee,
Wisconsin), 1 µl of 1 mM Biotin-11-dUTP (Enzo Diagnostics, New York, New
York), 1.5 µl each of dATP, dCTP, and dGTP (2 mM), and 1.0 µl 2 mM dTTP.

Radioactive labeling of DNA using UTCL was accomplished by
mixing: 10 pg - 100 ng of linearized template, 1.5 µl of 10X labeling buffer, 5 µl
of 32P-dCTP (3000 Ci/mmole, 10 µCi/µl or 40 µCi/µl), 0.5 µl of Thermus flavus
DNA polymerase (5 u/µl), and 0.5 µl each of dATP, dGTP, and dTTP (1 mM).
The reaction mix was brought to a volume of 15 µl with deionized
H2O, overlaid with mineral oil and cycled through 20 rounds of denaturation,
annealing and extension. A typical cycling regimen employed 20 cycles of
denaturation at 91°C for 5 sec, annealing at 50°C for 5 sec and extension at 72°C
for 30 sec. The reaction is then terminated by adding 1 µl of 0.5 M EDTA, pH
8.0. The amplified, labeled probe is a very heterogeneous mixture of fragments,
which appears as a smear when analyzed by agarose gel electrophoresis.

Estimation of Bio-11-dUTP Incorporation:

In order to estimate the level of incorporation of biotin-11-dUTP
into DNA, a serial dilution from 1:10 to 1:10^ of the labeled probe (free of
unincorporated biotin-11-dUTP) is made in TE (10 mM Tris, 1 mM EDTA, pH 8).
A microliter of each dilution is placed on a neutral nylon membrane, and the
DNA sample is bound to the membrane either by UV cross-linking for 3 min or
by baking at 80°C for 2 hr.

The unbound sites on the membrane are blocked using a blocking
buffer for 15 min at 25°C. Streptavidin-alkaline phosphatase (Gibco-BRL,
Gaithersburg, Maryland, Cat. No. 9545A) is added to the blocking buffer (0.058
M Na2HPO4, 0.017 M NaH2PO4, 0.068 M NaCl, 0.02% sodium azide, 0.5%
casein hydrolysate, 0.1% Tween-20) at a 1:5000 dilution and incubated for 30
min. The membrane is rinsed 3 times for 10 min each with wash buffer (1X
PBS [0.058 M Na2HPO4, 0.017 M NaH2PO4, 0.068 M NaCl], 0.3% Tween,
0.2% sodium azide) and rinsed briefly (5 minutes) with AP buffer (100 mM NaCl,
5 mM MgCl2, 100 mM Tris-Cl pH 9.5). Enough AP buffer containing
4.0 µl/ml nitro blue tetrazolium (NBT) (Sigma Cat. No. N6639) and
3.5 µl/ml of 5-bromo-4-chloro-3-indolyl phosphate (BCIP) (Sigma Cat. No.
B6777) is then added to cover the membrane. The membrane is left in the dark for
approximately 30 minutes or until the reaction is complete. The reaction is
stopped by rinsing in 1X PBS.

Detection Sensitivities

32P-labeled probes generated by the labeling protocols described
above detect as little as 25 zeptomoles (2.5 x 10^-20 moles) of a target
sequence. As little as 10 pg of template DNA is enough to synthesize 5-10 ng of
radiolabeled probe, which is sufficient for screening 5 Southern blots. The
radioactive versions of TCL and UTCL facilitate extremely high specific activities
of labeled probe (about 5 x 10^ cpm/µg DNA), which permits 5-10 fold lower
detection limits than conventional labeling protocols. The synthesis of higher
specific activity probes is probably the net result of the sequence-specific
oligonucleotide primers and their increased length when compared to the short
random primers used in other labeling methods. In addition, the thermal cycling
permits probe amplification.
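To put the stated detection limit in more familiar terms, 25 zeptomoles can be converted to a number of target molecules. The following is a minimal sketch in Python; the function name is an illustrative assumption, and the only outside value used is the Avogadro constant.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def zeptomoles_to_molecules(zmol):
    """Convert an amount in zeptomoles (1e-21 mol) to a molecule count."""
    moles = zmol * 1e-21
    return moles * AVOGADRO


moles_detected = 25 * 1e-21          # 25 zeptomoles expressed in moles
copies = zeptomoles_to_molecules(25)
print(f"{moles_detected:.1e} mol is about {copies:,.0f} target molecules")
```

On this reckoning the quoted limit corresponds to roughly fifteen thousand copies of the target sequence.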

Biotin-labeled probes generated by the TCL and UTCL protocols
detect as little as 25 zeptomoles (2.5 x 10^-20 moles) of a target sequence. A 15
µl TCL or UTCL reaction yields as much as 5-10 µg of labeled DNA, enough to
probe 5 to 10 Southern blots. Biotin-labeled TCL and UTCL probes provide a
10 fold greater detection sensitivity when compared to RPL biotin probes. In
addition, the thermal cycling permits probe amplification.

Non-radioactive, biotinylated probes labeled by the TCL and UTCL
methods were shown to have detection limits that are identical to the radioactive
probes. These methods have the advantage of eliminating the need to work with
hazardous radioactive materials without sacrificing sensitivity. In addition, results
are obtained from non-isotopic probes in 3-4 hours compared to 3-4 days for
radiolabeled probes. The ability to substitute non-radioactive probes for
radioactive probes may be very useful to clinical laboratories, which do not use
radioisotopes but do need greater detection sensitivities. Research laboratories
favor the use of non-isotopic systems if detection sensitivity is not an issue. The
non-isotopic labeling versions of the TCL and UTCL systems represent a major
improvement in labeling DNA probes. Non-radioactive probes generated by the
methods of the present invention are also useful in the detection of RNA in situ.
An advantage of this system is that labeling protocols of the present invention
yield highly sensitive non-radioactive probes, and the probes are
predominantly in the small molecular weight range and can therefore penetrate the
tissue easily, unlike RPL probes. Because non-radioactive probes labeled using the
labeling protocols of the present invention have the same detection limits as do
radioactive probes similarly labeled, it is within the scope of this invention to use
either radioactive or non-radioactive probes for probing, for example, Southern
blots, Northern blots, for in situ hybridization for the detection of mRNA or DNA
in cells or tissue directly, and for colony or plaque lifts.

Example 13 

Quasi-Random Fragmentation of DNA

Shotgun cloning and sequencing requires the generation of an
overlapping population of DNA fragments. Therefore, conditions were
established for the partial digestion of DNA with CviJI to produce an apparently
random pattern, or smear, of fragments in the appropriate size range.
Conventional methods for obtaining partially restricted DNA include limiting the
incubation time or limiting the amount of enzyme used in the digestion. Initially,
agarose gel electrophoresis and ethidium bromide staining of the treated DNA
were utilized to assess the randomness and size distribution of the fragments.

CviJI was obtained from CHIMERx (Madison, Wisconsin).
Digestion of pUC19 DNA for limited time periods, or with limiting amounts of
CviJI under normal or relaxed conditions, did not produce a quasi-random
restriction pattern, or smear. Instead, a number of discrete bands were observed,
as shown in Figure 7, lane 3 for the CviJI* partial digestion of pUC19. Complete
digests of pUC19 under normal and CviJI* buffer conditions are shown in lanes
1 and 2 respectively. These results show that, under these relaxed conditions,
CviJI has a strong restriction site preference.

To eliminate the apparent restriction site preferences observed
under the partial restriction conditions described above, a series of altered reaction
conditions were explored. Conditions of high pH, low ionic strength, addition of
solvents such as glycerol or dimethylsulfoxide, and/or substitution of Mn2+ for
Mg2+ were systematically tested with CviJI endonuclease using the plasmid
pUC19. Figure 7 shows the results of these tests. In Lane M, a 100 bp DNA
ladder was run. In Lanes 1-4, pUC19 DNA (1.0 µg) was run after digestion at
37°C in a 20 µl volume for the following times and conditions: Lane 1, complete
CviJI digest (1 unit of enzyme for 90 min in 50 mM Tris-HCl, pH 8.0, 10 mM
MgCl2, 50 mM NaCl); Lane 2, complete CviJI* digest (1 unit of enzyme for 90
min in 50 mM Tris-HCl, pH 8.0, 10 mM MgCl2, 50 mM NaCl, 1 mM ATP, 20
mM DTT); Lane 3, partial CviJI* digest (0.25 units of enzyme for 30 min in 50
mM Tris-HCl, pH 8.0, 10 mM MgCl2, 50 mM NaCl, 1 mM ATP, 20 mM
DTT); Lane 4, partial CviJI** digest (0.5 units of enzyme for 60 min in 10 mM
Tris-HCl, pH 8.0, 10 mM MgCl2, 10 mM NaCl, 1 mM ATP, 20 mM DTT, 20%
v/v DMSO); and Lane 5, uncut pUC19 (1.0 µg).

The digestion condition which yielded the best "smearing" pattern
was obtained when the ionic strength of the relaxed reaction buffer was lowered
and an organic solvent was added (Figure 7, lane 4). Plasmid pUC19 partially
digested under these conditions yields a relatively non-discrete smear. This
activity is referred to as CviJI** to differentiate it from the originally-
characterized star activity described in Xia et al., Nucl. Acids Res. 15:6075-6090
(1987). The appearance of diffuse, faint bands overlying a background smear
generated from this 2686 bp molecule indicates that some weakly preferred or
resistant restriction sites may bias the results of subsequent cloning experiments.

DNA was mechanically sheared by sonication utilizing a Heat
Systems Ultrasonics (Farmingdale, New York) W-375 cup horn sonicator as
specified by Bankier et al., Methods in Enzymology 155:51-93 (1987). DNA
fragmented by this method has random single-stranded overhanging ends (ragged
ends).

CviJI-digested and sonicated samples were size fractionated by
agarose gel electrophoresis and electroelution, or by spin columns packed with the
size exclusion gel matrix Sephacryl S-500 (Pharmacia LKB, Piscataway, N.J.), to
eliminate small DNA fragments. Spin columns (0.4 cm in diameter) were packed
to a height of 1.3 cm by adding 1 ml of Sephacryl S-500 slurry and centrifuging
at 2000 RPM for 5 minutes in a Beckman CPR centrifuge. The columns were
rinsed 3 times with 1 ml aliquots of 100 mM Tris-HCl (pH 8.0) by centrifugation
at 2000 RPM for 2 min. Typically, 0.2-2.0 µg of fragmented DNA in a total
volume of 30 µl was applied to the column. The void volume, containing those
DNA fragments larger than 500 bp, was recovered in the column eluant after
spinning at 2000 RPM for 5 minutes. The capacity of this micro-column
procedure is 2 µg of DNA. Agarose gel electrophoresis and electroelution are
described in detail by Sambrook et al., Molecular Cloning: A Laboratory Manual,
Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
(1989), and are well known to those skilled in the art. In these experiments, 5 µg
of sample was pipetted into a 2 cm-wide slot on a 1% agarose gel.
Electrophoresis was halted after the bromophenol blue tracking dye had migrated
6 cm. Fragments larger than 750 bp, as judged by molecular size markers, were
separated from smaller sizes and electrophoresed onto dialysis tubing (1000 MW
cutoff). The fractionated material was extracted with phenol-chloroform and
precipitated using ice cold ethanol (50% final volume) and ammonium acetate (2.5
M final concentration).

The ragged ends of the sonicated DNA were rendered blunt
utilizing two different end repair reactions. In one end repair reaction (ER 1)
sonicated DNA was treated according to the procedure outlined by Bankier et al.,
Methods in Enzymology 155:51-93 (1987), where 2.0 µg of sonicated lambda
DNA is combined with 10 units of the Klenow fragment of DNA polymerase I,
10 units T4 DNA polymerase, 0.1 mM dNTPs (deoxynucleotide
triphosphates = deoxyadenosine triphosphate, deoxythymidine triphosphate,
deoxycytidine triphosphate, and deoxyguanosine triphosphate) and reaction buffer
(50 mM Tris-HCl, pH 7.5, 10 mM MgCl2, 10 mM DTT). This mixture was
incubated at room temperature for 30 min followed by heat denaturation of the
enzymes at 65°C for 15 minutes. In a second end repair reaction (ER 2), an
excess of the reagents and enzymes described above was utilized to ensure a
more efficient conversion to blunt ends. In this reaction, 0.2 µg of the sonicated
lambda DNA sample was treated under the same reaction conditions described
above.

Figure 8 shows comparisons of the size distributions of sonicated
DNA versus DNA that was partially digested with CviJI**. In Lanes M, a 1 kb
DNA ladder was run. In Lanes 1-3, untreated λ DNA (0.25 µg), sonicated λ
DNA (1.0 µg), and CviJI** partially-digested λ DNA (1.0 µg) were run,
respectively. In Lanes 4-6, untreated pUC19 (0.25 µg), sonicated pUC19 (1.0
µg), and CviJI** partially-digested pUC19 (1.0 µg) were run, respectively.

Fragmentation of a large substrate such as lambda DNA (45 kb)
revealed essentially no banding differences between the CviJI** method and
sonication, as demonstrated in Figure 8, lanes 2 and 3. In addition, pUC19 DNA
that was partially digested with CviJI** gave a size distribution or "smear" that
closely resembled that achieved with sonication (Figure 8, lanes 5 and 6). As
expected, the minor bias evident with a small molecule such as pUC19 was not
detectable with a larger substrate such as lambda DNA.
The intensity and duration of sonic treatment affect the size
distribution of the resulting DNA fragments. The results obtained from the
sonication of lambda and pUC19 samples (Figure 8) were obtained from three 20
second pulses at a power setting of 60 watts. Sonication-generated smears are
similar, although the size distribution of fragments is consistently greater with
CviJI fragmentation. This result favors the cloning of larger inserts, which
improves the efficiency of end-closure strategies (Edwards et al., Genomics 6:593-
608 (1990)). The size distribution of the DNA fragmented by CviJI** is
controlled by incubation time and amount of enzyme, variables which are readily
optimized by routine analysis. An excess of enzyme or a long incubation time
will completely digest pUC19 DNA, resulting in fragments which range in size
from approximately 20 bp to approximately 150 bp (Figure 7, lanes 1 and 2).
The results shown in Figure 8 were obtained by incubating pUC19 for 40 minutes
and lambda DNA for 60 minutes with 0.33 units of CviJI/µg substrate. The
efficiencies of the two methods for randomly fragmenting DNA were
quantitatively analyzed for use in molecular cloning, as described below.

Example 14 

Rapid DNA Size Fractionation Utilizing Spin Column Chromatography 

The amount of data obtained by the shotgun sequencing approach
is substantially increased if fragments of less than 500 bp are eliminated prior to
the cloning step. Small fragments yield only a portion of the sequence data which
may be collected from polyacrylamide gel based separations and, thus, such small
fragments lower the efficiency of this strategy. Agarose gel electrophoresis
followed by electroelution is commonly used to size fractionate DNA prior to
shotgun cloning (Bankier et al., Methods in Enzymol. 155:51-93 (1987)).
Approximately three hours are required to prepare the agarose gel, electrophorese
the sample, electroelute fragments larger than 500 bp, perform phenol-chloroform
extractions, and precipitate the resulting material.

The results of 5 out of 9 independent trials size-fractionating
CviJI**-fragmented lambda DNA by agarose gel electrophoresis are shown in
Figures 9A-E. Figures 9A-E illustrate the following. In Figure 9A: lane M,
1 kb DNA ladder; lane λ, untreated λ DNA (0.25 µg); lane 1, unfractionated
(UF) CviJI** partially-digested λ DNA (1.0 µg); lane 2, column-fractionated (CF)
CviJI** partially-digested λ DNA (1.0 µg); lane 3, gel-fractionated (GF) CviJI**
partially-digested λ DNA (1.0 µg). Figures 9B-E show additional trials of the
same treatments, in lanes carrying the same labels as in Figure 9A.

Small DNA fragments may also be removed by passing the sample
through a short column of Sephacryl S-500. Approximately 15 min are needed
to prepare the column and 5 min to fractionate the DNA by this method.

The results of three out of nine trials using a Sephacryl S-500
column are shown in Figures 9A-C. The efficiency of eliminating small DNA
fragments (<500 bp) by spin column chromatography appears high, and the
reproducibility was excellent. This result is in contrast to the agarose gel
electrophoresis and electroelution data presented in Figures 9A-E, wherein nine
replicate trials of this method yielded nine differently sized products, regardless
of the source of the agarose. Both methods yielded 30-40% recoveries as
measured by UV spectrophotometry. To quantitate the relative efficiencies of the
two fractionation methods, the lambda DNA size fractionated in Figure 9A lanes
2 and 3, and Figure 9B lane 3 were analyzed for cloning efficiency and insert
size, as described below.

Example 15

Cloning Efficiencies of Gel Elution and
Chromatography Fractionation Methods

The efficacy of size selection was quantified by two criteria: 1) by
comparing the relative cloning efficiency of CviJI** partially-digested lambda
DNA fragments fractionated either by agarose gel electrophoresis and
electroelution or micro-column chromatography, and 2) by determining the size
distribution of the resulting cloned inserts. To reduce potential variables, large
quantities of the cloning vector and ligation cocktail were prepared, ligation
reactions and transformation of competent E. coli were performed on the same
day, numerous redundant controls were performed, and all cloning experiments
were repeated twice. Ligation reactions were carried out overnight at 12°C in 20
µl mixtures using the following conditions: 25 mM Tris-HCl (pH 7.8), 10 mM
MgCl2, 1 mM DTT, 1 mM ATP, DNA, and 2000 units of T4 DNA ligase. For
unfractionated samples, 10 ng of fragments and 100 ng of HincII-restricted,
dephosphorylated pUC19 were combined under the above conditions. For
Sephacryl S-500 fractionated samples, 50 ng of size-selected fragments were
ligated with 100 ng of HincII-restricted, dephosphorylated pUC19. This increase
in fractionated DNA was determined empirically to compensate for the lower
concentration of "ends" resulting from the fractionation procedure and/or the
lowered efficiency of cloning larger fragments. Ligation reaction products were
added to competent E. coli DH5αF' (φ80dlacZΔM15 Δ(lacZYA-argF)U169 deoR
gyrA96 recA1 relA1 endA1 thi-1 hsdR17(rK-, mK+) supE44 λ-) in a
transformation mixture as specified by the manufacturer (Life Technologies,
Bethesda, Maryland) and aliquots of the transformation mixture were plated on
T agar (Messing, Methods in Enzymol. 101:20-78 (1983)) containing 20 µg/ml
ampicillin, 25 µl of a 2% solution of isopropylthiogalactoside (IPTG) and 25 µl
of a 2% solution of 5-bromo-4-chloro-3-indolylgalactoside (X-GAL). The
cloning efficiencies reported are the average of triplicate platings of each ligation
reaction. The concentration of the fractionated material was checked
spectrophotometrically so that 50 ng was added to all ligation reactions. This
material was ligated to HincII-digested and dephosphorylated pUC19. This
cloning vector was chosen because it permits a simple blue to white visual assay
to indicate whether a DNA fragment was cloned (white) or not (blue) (Messing,
Methods in Enzymol. 101:20-78 (1983)).

A summary of the cloning efficiencies calculated from two
independent trials is given in Table 3.

TABLE 3

Cloning Efficiencies of CviJI** Partially Digested Lambda DNA
Fractionated by Microcolumn Chromatography Versus Agarose Gel
Electroelution.

                                          Colony Phenotype
                                      Trial I           Trial II
DNA/treatment                       Blue    White     Blue    White

Supercoiled pUC19                  55000      <10    50000      <10
pUC19/HincII/CIAP                    210       <1      320        1
pUC19/HincII/CIAP/T4 DNA ligase      150        4      210        7
λ/CviJI** partial/CF + pUC19         140      240      210      240
λ/CviJI** partial/GFE1 + pUC19        98       49      200       18
λ/CviJI** partial/GFE2 + pUC19        82       54       95       74

Cloning efficiencies reflect the number of ampicillin-resistant
colonies/ng pUC19 DNA. CIAP represents treatment with calf intestinal alkaline
phosphatase used to dephosphorylate HincII-digested pUC19 to minimize self-
ligation. CF refers to DNA that was fractionated on Sephacryl S-500 columns as
described above. GFE1 and GFE2 refer to two runs wherein DNA was
fractionated by agarose gel electrophoresis and electroeluted. λ refers to
bacteriophage λ DNA.

These trials represent repeated experiments in which λ DNA
fragments generated by CviJI** partial digestion were ligated to HincII-linearized,
dephosphorylated pUC19 and transformed into the DH5αF' competent cells described
above. The first three rows in Table 3 show controls performed to establish a
baseline to better evaluate the various treatments. Supercoiled pUC19 transforms
E. coli 10 times more efficiently than the HincII-digested plasmid and 150-260
times more efficiently than the HincII-digested and dephosphorylated plasmid.
The number of blue and white colonies which resulted from transforming HincII-
cut and dephosphorylated pUC19 was determined both before and after treatment
with T4 DNA ligase in order to differentiate these background events from
cloning inserts. The background of blue colonies (which represent the uncut
and/or non-dephosphorylated population of molecules) averaged 0.4%, compared
to supercoiled plasmid. The background of white colonies (which presumably
results from contaminating nucleases in the enzyme treatments or genomic DNA
in the plasmid preparations) after HincII-digestion, dephosphorylation, and ligation
of pUC19 averaged 0.014% as compared to the supercoiled plasmid.
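The fold differences quoted in this example can be recomputed from the Table 3 values. The following is a minimal sketch in Python; the variable names are illustrative assumptions, and the assignment of each number to a trial and colony color follows a reading of the flattened table layout rather than anything stated explicitly in the text.

```python
# Blue colonies/ng vector for the control rows (Trial I, Trial II).
supercoiled_blue = {"I": 55000, "II": 50000}
dephosphorylated_blue = {"I": 210, "II": 320}   # pUC19/HincII/CIAP

# How many times more efficiently supercoiled plasmid transforms.
control_ratio = {
    trial: supercoiled_blue[trial] / dephosphorylated_blue[trial]
    for trial in supercoiled_blue
}

# White colonies/ng vector: column-fractionated (CF) versus the four
# gel-fractionated and electroeluted (GFE1, GFE2; two trials) values.
cf_white = 240
gfe_white = [49, 54, 18, 74]
cf_fold = [cf_white / w for w in gfe_white]

for trial, ratio in control_ratio.items():
    print(f"Trial {trial}: supercoiled / dephosphorylated = {ratio:.0f}x")
print(f"CF versus GFE white colonies: "
      f"{min(cf_fold):.1f}x to {max(cf_fold):.1f}x")
```

The control ratios fall near the 150-260 range given above, and the CF to GFE ratios span roughly three to thirteen, in line with the figures stated for column versus gel fractionation.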

The number of white colonies obtained when micro-column
fractionated DNA was cloned into pUC19 was 240/ng vector in both trials. The
efficiency of cloning gel fractionated and electroeluted DNA ranged from 18-74
white colonies/ng vector. The data show that column fractionated DNA results
in three to thirteen times the number of white colonies, and presumably
recombinant inserts, as gel fractionated and electroeluted DNA. The size
distribution of the inserts present in these white colonies is depicted in Figures
10A-C. In Figure 10A, a CviJI** partial digest of 2 µg of λ DNA was size
fractionated on a 4 mm by 13 mm column of Sephacryl S-500 at 2,000 x g for 5
minutes. The void volume containing partially digested DNA was directly ligated
to linear, dephosphorylated pUC19 and 43 resulting clones were analyzed for
insert size. The DNA for this experiment is the same as that shown in Figure
9A, lane 2. In Figure 10B, a CviJI** partial digest of 5 µg of λ DNA was size
fractionated by agarose gel electroelution. The eluted DNA was phenol-extracted
and ligated to linear, dephosphorylated pUC19, and the resulting 40 clones were
analyzed for insert size. The DNA for this experiment is the same as that shown
in Figure 9A, lane 3. In Figure 10C, the procedure is the same as in Figure 10B,
except the DNA for this experiment came from Figure 9B, lane 3.

A total of 43 random clones obtained from micro-column
chromatography fractionation were analyzed for insert size (as shown in Figure
10A). Most of these inserts were larger than 500 bp (37/43 or 86%), 11.6%
(5/43) were smaller than 500 bp, and one clone (2.3%) was smaller than 250 bp.
The average insert size was 1630 bp. These results are in contrast to those
obtained by agarose gel fractionation (as shown in Figures 10B and 10C). In the
first trial (Figure 10B) most of the inserts were smaller than 500 bp (26/37 or
70.3%) and only 29.7% (11/37) were larger than 500 bp in size. In the second
trial (Figure 10C) all of the inserts (40 total) were smaller than 500 bp. Thus,
the use of agarose gel electroelution for the size fractionation of DNA results in
unexpectedly variable and low cloning efficiencies.
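The insert-size percentages above follow directly from the clone counts. The following is a minimal sketch in Python that recomputes them; the helper name and the dictionary layout are illustrative assumptions.

```python
def percent(count, total):
    """Percentage of clones in a size class, rounded to one decimal."""
    return round(100 * count / total, 1)


# Clone counts by insert-size class, as reported in the text.
micro_column = {"total": 43, "over_500": 37, "under_500": 5, "under_250": 1}
gel_trial_1 = {"total": 37, "under_500": 26, "over_500": 11}

column_pct = {
    size_class: percent(n, micro_column["total"])
    for size_class, n in micro_column.items() if size_class != "total"
}
gel_pct = {
    size_class: percent(n, gel_trial_1["total"])
    for size_class, n in gel_trial_1.items() if size_class != "total"
}

print("micro-column:", column_pct)
print("gel, first trial:", gel_pct)
```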

Example 16

Cloning Sonicated and CviJI**-Digested Lambda DNA

To compare the cloning efficiencies of sonicated and CviJI**
digested nucleic acid, λ DNA was fragmented by each of these methods and
ligated to pUC19 which was linearized with HincII and dephosphorylated to
minimize self-ligation.

DNA fragmented by CviJI digestion and sonication was cloned
both before and after Sephacryl S-500 size fractionation. Sonicated lambda DNA
was subjected to an end repair treatment prior to ligation. Ligations were
performed as described in Example 11. One-tenth of the ligation reaction (2 µl)
was utilized in the transformation procedure, and the fraction of nonrecombinant
(blue) versus recombinant (white) colonies was used to calculate the efficiency of
this process.

The efficacy of the methods was quantified by comparing the
cloning efficiency of lambda DNA fragments generated either by sonication or
CviJI partial digestion. To reduce potential cloning differences based on size
preference, the size distribution of the DNA generated by these two methods was
closely matched. Other experimental details were designed to reduce potential
variables, as described above. Certain variables were unavoidable, however. For
example, the sonicated DNA fragments required an enzymatic step to repair the
ragged ends as described in Example 1 prior to ligation, whereas the CviJI**
digests were heat-denatured and directly ligated to HincII-digested pUC19.

A summary of the cloning efficiencies calculated from two
independent trials is given in Table 4, section A (unfractionated samples), and
section B (fractionated samples).

Cloning efficiencies represent the number of ampicillin-resistant
colonies/ng pUC19 DNA. CIAP indicates treatment with calf intestinal alkaline
phosphatase. ER 1 and ER 2 are end repair methods described in Example 13.
λ refers to bacteriophage lambda.
The indicated trials represent repeated experiments in which two
identical sets of lambda DNA fragments generated by AluI complete digestion,
CviJI partial digestion, or sonication were each ligated to HincII-linearized,
dephosphorylated pUC19 and transformed into DH5αF' competent cells. The
cloning efficiencies reported are the average of triplicate platings of each ligation
reaction. In case the Sephacryl S-500 size fractionation step introduced inhibitors
of ligation or transformation or resulted in differences attributable to the size of
the material, the sonicated and CviJI**-digested samples were ligated with pUC19
both prior to (A) and after (B) the fractionation steps. The first three rows in
Table 4, sections A and B, are controls performed to establish a baseline to better
evaluate the various treatments. These data show that supercoiled pUC19
transforms E. coli 200-1000 times more efficiently than the HincII-restricted and
dephosphorylated plasmid. Without this dephosphorylation step, the cloning
efficiency is 10% that of the supercoiled molecule (data not presented). The
background of blue colonies averaged 0.5% in these experiments, compared to
supercoiled plasmid, while the background of white colonies averaged 0.005%.

A comparison of the data from unfractionated versus fractionated
samples in Table 4, sections A and B, reveals a general decline in the number of
white and blue colonies obtained after sizing. This decrease is primarily due to
the fact that cloning efficiencies are dependent upon the size of the fragment,
favoring smaller fragments and thus giving higher efficiencies for the
unfractionated material. This is illustrated by comparing the efficiency of cloning
unfractionated and fractionated λ DNA which was completely restricted with AluI.
This four base recognition endonuclease produces blunt ends and cuts λ DNA
(48,502 bp) at 143 sites. Only 25 of the resulting 144 fragments (17%) are larger
than 500 bp. The number of white colonies obtained when unfractionated λ
DNA, completely restricted with AluI, was cloned into pUC19 ranged from 250-
400/ng vector, versus 23-48/ng vector for the fractionated material. This ten fold
decrease was only noticed for the λ AluI digests, and probably reflects the large
portion of small molecular weight fragments (approximately 75%) which is
excluded from the fractionated ligation reactions.
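The relationship between cut sites, fragment count, and the fraction of fragments above the size cutoff can be illustrated in a few lines. The following is a minimal sketch in Python; the function names and the toy cut positions are hypothetical and are not the actual AluI sites of lambda DNA, while the counts of 143 sites, 144 fragments, and 25 large fragments are taken from the text.

```python
def fragment_lengths(molecule_length, cut_positions):
    """Fragment lengths from a complete digest of a linear molecule."""
    bounds = [0] + sorted(cut_positions) + [molecule_length]
    return [end - start for start, end in zip(bounds, bounds[1:])]


def fraction_larger_than(lengths, cutoff_bp):
    """Fraction of fragments strictly longer than the cutoff."""
    return sum(1 for n in lengths if n > cutoff_bp) / len(lengths)


# Toy example with made-up cut positions on a 5000 bp linear molecule.
toy_fragments = fragment_lengths(5000, [120, 900, 1500, 1650, 4000])
toy_fraction = fraction_larger_than(toy_fragments, 500)
print(toy_fragments, f"{toy_fraction:.2f}")

# Figures from the text: n cuts in a linear molecule give n + 1 fragments.
alu_sites = 143
alu_fragments = alu_sites + 1
large_fraction = 25 / alu_fragments
print(f"{alu_fragments} fragments, {100 * large_fraction:.0f}% over 500 bp")
```

The same two helpers could be applied to a real list of site coordinates to estimate how much material a 500 bp cutoff would exclude from a complete digest.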

The number of white colonies obtained when unfractionated CviJI**
treated λ DNA was cloned into pUC19 ranged from 160-340/ng vector, versus 68-
90 white colonies/ng vector if the same material was fractionated. Unfractionated
λ DNA, completely digested with AluI, results in cloning efficiencies very similar
to unfractionated CviJI** treated DNA. Sonicated λ DNA is a poor substrate for
ligation, compared to CviJI** treatment, as indicated by the roughly ten-fold
reduced cloning efficiencies.

Enzymatic repair of the ragged ends produced by sonication results
in an increased cloning efficiency. Using conditions described in Example 13 for
the first end repair treatment (ER 1), 10-44 (fractionated) and 19-32
(unfractionated) white colonies/ng vector were observed. However, ER 1
conditions may not be optimal, as an alternate end repair reaction (ER 2) (as
described in Example 13) resulted in greater numbers of white colonies (63 and
100/ng vector for fractionated and unfractionated DNA, respectively). In this
reaction, a ten-fold excess of reagents and enzymes was utilized to repair the
sonicated DNA, which apparently improved the efficiency of cloning such
molecules by two to three fold. The data collected from multiple cloning trials
in Table 3, sections A and B, show that CviJI** partial digestion results in three
to sixteen times as many white colonies as sonicated ER 1-treated DNA.
Even with an optimal end repair reaction for the sonicated fragments, DNA
treated with CviJI** yielded three times more white colonies.

Example 17

Analysis of CviJI** Fragmentation for Shotgun Cloning and Sequencing

The ability of CviJI** partial digestion to create uniformly
representative clone libraries for DNA sequencing was tested on pUC19 DNA.
pUC19 DNA was digested under CviJI** conditions and size fractionated as
described above. The fractionated DNA was cloned into the EcoRV site of
M13SPSI, a lacZ minus vector constructed by adding an EcoRV restriction site
to wild type M13 at position 5605. M13SPSI lacks a genetic cloning selection
trait, therefore after ligation of the pUC19 fragments into the vector the sample
was restricted with EcoRV to reduce the background of nonrecombinant plaques.
Bacteriophage M13 plaques were picked at random and grown for 5-7 hours in 2
ml of 2XTY broth containing 20 μl of a DH5αF' overnight culture. After
centrifugation to remove the cells, single-stranded phage DNA was purified using
Sephaglas™ as specified by the manufacturer (Pharmacia LKB, Piscataway, New
Jersey). The single-stranded DNA was sequenced by the dideoxy chain
termination method using a radiolabeled M13-specific primer and Bst DNA
polymerase (Mead et al., Biotechniques 11:76-87 (1991)). The first 100 bases of
76 randomly chosen clones were sequenced to determine which CviJI recognition
site was utilized, the orientation of each insert and how effectively the cloned
fragments covered the entire molecule, as shown in Figure 11. The positions of
the 45 normal CviJI sites (PuGCPy) in pUC19 are indicated beneath the line
labeled "NORMAL" in Figure 11. Similarly, the 160 CviJI* sites (GC) are
indicated beneath the line labeled "RELAXED" in Figure 11. The marks above
these lines indicate the CviJI** pUC19 sites which were found in the set of 76
sequenced random clones. The frequency of cloning a particular site is indicated
by the height of the line, and the left or right orientation of each clone is also
indicated at the top of each mark. There are a total of 205 CviJI and CviJI* sites
in pUC19.

The data presented in Figure 11 demonstrate that, under CviJI
partial conditions, normal CviJI sites are preferentially restricted over relaxed
(CviJI*) sites. Of the 76 clones that were analyzed, only 13%, or 1 in 7, had
sequence junctions corresponding to a relaxed CviJI* site. Thirty-five of the
forty-five possible normal restriction sites were cloned, as compared to eight of
the possible one hundred sixty relaxed sites. If the enzyme had exhibited no
preference for normal or relaxed sites under the CviJI partial conditions utilized
here, then 78% of the sequence junctions analyzed should have been generated by
cleavage at a relaxed CviJI* site. It may be noted that the relaxed CviJI*
restriction sites that were found appear to be clustered in two regions of the
plasmid that are deficient in normal CviJI sites. In addition, the combined
distribution of the normal and relaxed sites which were restricted to generate the
76 clones appears to be quasi-random. That is, the longest gap between cloned
restriction sites was no greater than 250 bp and no one particular site is
over-utilized.
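
The 78% figure follows directly from the site counts quoted above. A minimal sketch of that arithmetic, assuming only that every site would be equally likely to supply a junction if the enzyme had no preference (the variable names are illustrative):

```python
# Site counts for pUC19 as stated in the text.
NORMAL_SITES = 45    # PuGCPy
RELAXED_SITES = 160  # remaining GC sites
CLONES = 76          # sequenced clones

total_sites = NORMAL_SITES + RELAXED_SITES

# With no site preference, each junction is equally likely to fall at any site.
expected_relaxed_fraction = RELAXED_SITES / total_sites
expected_relaxed_clones = expected_relaxed_fraction * CLONES

print(f"total sites: {total_sites}")
print(f"expected relaxed fraction: {expected_relaxed_fraction:.0%}")
print(f"expected relaxed junctions among {CLONES} clones: {expected_relaxed_clones:.0f}")
```

Comparing the expected fraction with the observed 13% gives a quick sense of how strongly the partial conditions favor normal sites.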

A detailed analysis of the distribution of CviJI sequence junctions
found from cloning pUC19 is presented in Table 5.

The GC sites in pUC19 may be divided into four classes based on
their flanking Pu/Py structure. The fraction of GC sites observed in pUC19 which
belong to each classification is roughly equal (22.0-27.8%). A striking difference
was found between the observed distribution in pUC19 of normal and relaxed (R1,
R2, R3) CviJI recognition sites and the distribution revealed by shotgun cloning
and sequence analysis of CviJI**-treated DNA. While most of the sites cleaved
by this treatment were found to be PuGCPy (about 87%), or "normal" restriction
sites, a significant fraction of the cleavage occurred at PyGCPy (about 6.5%) and
PuGCPu (about 6.6%) sites, considering the short incubation times and limiting
enzyme concentrations. The latter two categories of sites, and presumably the
PyGCPu sites as well, are completely restricted under "relaxed" conditions,
provided an excess of enzyme is present and sufficient time is allowed (see Figure
7, and Xia et al., Nucleic Acids Res. 15:6075-6090 (1987)).
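
The four-way classification of GC sites by flanking purine/pyrimidine context can be illustrated with a short scan. This is a sketch only: the function name and the toy sequence are hypothetical, and the sequence is treated as linear, so a site spanning the origin of a circular plasmid would be missed.

```python
from collections import Counter

PURINES = set("AG")

def classify_gc_sites(seq: str) -> Counter:
    """Count GC dinucleotides by flanking class, e.g. 'PuGCPy'."""
    seq = seq.upper()
    counts = Counter()
    # A site needs one flanking base on each side, so scan interior positions only.
    for i in range(1, len(seq) - 2):
        if seq[i:i + 2] == "GC":
            left = "Pu" if seq[i - 1] in PURINES else "Py"
            right = "Pu" if seq[i + 2] in PURINES else "Py"
            counts[f"{left}GC{right}"] += 1
    return counts

# Toy input, not pUC19: one normal (PuGCPy) site and one PyGCPu site.
print(classify_gc_sites("AGCTTGCA"))
```

Running the same scan over a full plasmid sequence would give per-class totals of the kind compared in the paragraph above.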

Digestion using CviJI** treatment results in a relatively even
distribution of breakage points across the length of the molecule (as shown in
Figure 11). As described above, Figure 11 depicts a linear map of pUC19
showing the relative position of the lacZ' gene (α peptide of β-galactosidase gene)
and ampicillin resistance gene (Amp). The marks extending beneath the top line
(labeled "NORMAL") show the relative position of the 45 normal CviJI sites
(PuGCPy) present in pUC19. The marks above the line are the cleavage sites
found from sequencing the CviJI** partial library. The height of the line
indicates the number of clones obtained from cleavage at that site, and the
orientation of the flag designates the right or left orientation of the respective
clone. The marks extending beneath the second line (labeled "RELAXED") show
the relative positions of the 160 CviJI* sites (GC) present in pUC19. Those marks
above the line were found from sequencing the CviJI** partial library. The
bottom portion of Figure 11 shows the relative position and orientation of the first
20 clones sequenced, assuming a 350 bp read per clone. CviJI** cleavage at
relaxed sites appears to be important in "filling gaps" left by normal restriction.

The primary goal of this effort was to determine the efficacy of
these methods for rapid shotgun cloning and sequencing. For these purposes,
only 100 bases of sequence data were acquired per clone. However, if 350 bases
of sequence had been determined from each clone, then the entire sequence of
pUC19 would have been assembled from the overlap of the first 20 clones (Figure
11). In this sequencing simulation 75% of pUC19 would have been sequenced
at least 2 times from the first 20 clones. The highest degree of overfold
sequencing would have been 6, and only involved 2.2% of the DNA. Figure 11
also shows that most of the 1x sequencing coverage occurred in a region of the
plasmid with a very low density of normal and relaxed CviJI restriction sites.
Most of the single coverage occurs in a 240 bp region of the plasmid between
1490 bp and 1730 bp where there are only 4 CviJI relaxed sites. It should also
be noted that by the 27th randomly picked clone most of this region would have
been covered a second time.
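
The kind of coverage simulation described above can be sketched as follows. The junction positions below are hypothetical and are not the clones of Figure 11; the map is treated as linear, and each clone is assumed to contribute one read of fixed length running away from its junction.

```python
def coverage_depth(length, clones, read_len):
    """Per-base read depth on a linear map.

    clones: iterable of (junction_position, direction); direction +1 reads
    rightward from the junction and -1 reads leftward.
    """
    depth = [0] * length
    for pos, direction in clones:
        if direction > 0:
            start, end = pos, min(pos + read_len, length)
        else:
            start, end = max(pos - read_len, 0), pos
        for i in range(start, end):
            depth[i] += 1
    return depth

def fraction_at_least(depth, k):
    """Fraction of positions covered by at least k reads."""
    return sum(1 for d in depth if d >= k) / len(depth)

# Hypothetical junctions on a 1200 bp toy map, 350 bp reads.
clones = [(0, +1), (300, +1), (900, -1), (1000, +1)]
depth = coverage_depth(1200, clones, 350)
print(f"covered at least once:  {fraction_at_least(depth, 1):.1%}")
print(f"covered at least twice: {fraction_at_least(depth, 2):.1%}")
print(f"maximum depth: {max(depth)}")
```

Feeding in real junction positions and orientations would reproduce the single-coverage and multiple-coverage fractions discussed in the text.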

Shotgun sequencing strategies are efficient for accumulating the
first 80-95% of the sequence data. However, the random nature of the method
means that the rate at which new sequence is accumulated decreases as more
clones are analyzed. In Figure 12 the total amount of unique pUC19 sequence
accumulated was plotted as a function of the number of clones sequenced. The
points represent a plot of the total amount of determined pUC19 sequence versus
the total number of clones sequenced. The horizontal dashed line demarcates the
2686 bp length of pUC19. The smooth curve is the theoretical accumulation
curve expected for a process in which sequence information is acquired in a
totally random fashion; it is a continuous plot of the discrete function S(N), where

$$S(N) = N L e^{-c\sigma}\left[\frac{e^{c\sigma} - 1}{c} + (1 - \sigma)\right].$$

This equation is based upon the results developed by Lander et al., Genomics
2:231-239 (1988) for the progress of contig generation in genetic mapping. In the
equation: N is the number of clones sequenced, L is the length of clone insert in
bp, c is the redundancy of coverage or LN/G (where G is length of fragment
being sequenced in bp), and σ = 1-θ, where θ is the fraction of length that two
clones must share. The curve in Figure 12 was calculated with G = 2686 bp,
L = 350 bp, and σ = 1. The plotted points lie close to the theoretical curve, and
it thus appears that the sequence of pUC19 was accumulated in an apparent
random fashion utilizing CviJI** fragmentation and column fractionation.
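
The theoretical accumulation curve can be evaluated numerically. The sketch below assumes the reading S(N) = N L e^(-cσ) [ (e^(cσ) - 1)/c + (1 - σ) ] with c = LN/G, and uses the stated parameters G = 2686 bp, L = 350 bp and σ = 1 as defaults; the function name is illustrative.

```python
import math

def expected_sequence(n, clone_len=350, target_len=2686, sigma=1.0):
    """Expected unique sequence (bp) after n random clones.

    S(N) = N * L * exp(-c*sigma) * ((exp(c*sigma) - 1) / c + (1 - sigma)),
    with c = L * N / G.
    """
    if n == 0:
        return 0.0
    c = clone_len * n / target_len
    return n * clone_len * math.exp(-c * sigma) * (
        (math.exp(c * sigma) - 1.0) / c + (1.0 - sigma)
    )

for n in (5, 10, 20, 40, 76):
    s = expected_sequence(n)
    print(f"N={n:2d}  S(N)={s:7.1f} bp  ({s / 2686:.1%} of pUC19)")
```

With σ = 1 the expression reduces to G(1 - e^(-c)), which makes the diminishing return per additional clone easy to see: each new clone adds less unique sequence as c grows.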

Example 18 

Shotgun Cloning Utilizing 200 ng of Lambda DNA 

Generally, 2-5 μg of DNA are needed for the sonication and
agarose gel fractionation method of shotgun cloning in order to provide the
several hundred colonies or plaques required for sequence analysis (Bankier et al.,
Methods in Enzymol. 155:51-93 (1987)). A ten-fold reduction in the amount of
substrate required greatly simplifies the construction of such libraries, especially
from large genomes (Davidson, J. DNA Sequencing and Mapping 1:389-394
(1991)). The efficiency of constructing a large shotgun library from nanogram
amounts of substrate was tested utilizing 200 ng of CviJI**-digested lambda DNA.
This material was column-fractionated as described previously. In this case, 1/2
of the column eluant (15 μl containing 50 ng of DNA) was ligated to 100 ng of
HincII-digested and dephosphorylated pUC19 as described in Example 15. The
cloning efficiencies of the control DNAs were similar to those reported in Tables
2 and 3. The 50 ng cloning experiment yielded 230 white colonies per ligation
reaction in one trial, and 410 white colonies per ligation reaction in a second trial.
Thus, it should be possible to routinely construct useful quasi-random shotgun
libraries from as little as 0.2 - 0.5 μg of starting material.

Example 19
Epitope Mapping

CviJI* recognizes the sequence GC (except for PyGCPu) in the
target DNA. Under partial restriction conditions the length of fragment may be
controlled by incubation time. Epitope mapping using CviJI** partial digests
involves generating DNA fragments of 100-300 bp from a cDNA coding for the
protein of interest, by methods described in Example 13, inserting them into an
M13 expression vector, plating out on solid media, lifting plaques onto a
membrane, screening for binding to the ligand of interest, and picking the positive
plaques for isolation of the DNA, which is then sequenced to identify the epitope.
Thus, the same epitope may be expressed as a small fragment or a larger
fragment. This approach allows one to determine the smallest fragment
containing the epitope of interest using functional assays such as binding to an
antibody or other ligand, or using a direct assay for activity. For insertion into
an M13 vector, linkers may be added to the fragments or the insert may be
dephosphorylated to ensure that each fragment is cloned alone without ligation of
multiple inserts.

The expression vectors recommended for subcloning of the CviJI
fragments are Lambda Zap (Stratagene, La Jolla, California) or bacteriophage
M13-epitope display vectors. An advantage of using an M13-based vector is that
the peptide or protein of interest may be displayed along with the M13 coat
protein and does not require host cell lysis in order to analyze the protein of
interest. The lambda-based vectors yield plaques and hence the protein can be
directly bound to a membrane filter.

Example 20
CGase I

CGase I, as used herein, refers to a restriction endonuclease reagent which
cleaves DNA at the dinucleotide CG. CGase I activity is based on the combined
star activities of the restriction endonucleases Hpa II and Taq I. Under normal
reaction conditions (10 mM Bis Tris Propane-HCl pH 7.0, 10 mM MgCl2, 1 mM
DTT; 1 unit of enzyme/μg DNA, 37°C for 1 hr), Hpa II recognizes CCGG and
cleaves after the first C to leave a 2-base 5' overhang. Under normal reaction
conditions (100 mM NaCl, 10 mM Tris-HCl pH 8.4, 10 mM MgCl2, 10 mM
2-mercaptoethanol, 1 unit of enzyme/μg DNA, 65°C for 1 hr) the restriction
endonuclease Taq I recognizes TCGA and cleaves after the T to leave a 2-base
5' overhang.

Reaction conditions have been described for Taq I* activity which decrease
the cleavage specificity of Taq I (10 mM Tris-HCl pH 9.0, 5 mM MgCl2, 6 mM
2-mercaptoethanol, 20% DMSO; 2000 units of enzyme/μg DNA, 65°C for 1 hr)
(Barany, Gene 65:149-165 (1988)). These reaction conditions allow Taq I* to
cleave DNA at the following sequences:

Taq I*    TCGA
          CCGA (TCGG)
          ACGA (TCGT)
          TCTA (TAGA)
          TCAA (TTGA)
          GCGA (TCGC)
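
Each parenthetical entry in the list above is the same site read on the opposite strand. A minimal sketch that derives those readings (the helper name is illustrative, and only the unambiguous bases A, C, G and T are handled):

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(site: str) -> str:
    """Return the site as read 5'->3' on the opposite strand."""
    return site.translate(COMPLEMENT)[::-1]

taq_star_sites = ["TCGA", "CCGA", "ACGA", "TCTA", "TCAA", "GCGA"]
for site in taq_star_sites:
    rc = reverse_complement(site)
    note = "palindromic" if rc == site else f"opposite strand reads {rc}"
    print(f"{site}: {note}")
```

Only the canonical TCGA site is its own reverse complement; every relaxed site therefore appears as a pair.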

We are unaware of any literature descriptions of Hpa II* conditions.
However, the following conditions were established to promote Hpa II* activity
which are also compatible with Taq I* activity: 5 mM KCl, 10 mM Tris-HCl pH
8.5, 10 mM MgCl2, 1 mM DTT, 15% DMSO, 100 μg/ml BSA (CGase buffer);
50 units of enzyme/μg DNA, 50°C for 1 hr. The Hpa II* recognition sites were
determined by cloning and sequencing Hpa II*-restricted fragments. The
characterized Hpa II* recognition sequences are as follows:

Hpa II*   CCGG
          CCGC (GCGG)
          CCGA (TCGG)
          ACGG (CCGT)

Taq I (400 units/μg DNA) and Hpa II (50 units/μg DNA) were then
combined (CGase I) in CGase I buffer and the following recognition sites were
identified by cloning and sequencing restricted pUC19 fragments.

CGase I   GCGC
          TCGA
          CCGG
          GCGT
          ACGA
          ACGG (CCGT)
          GCGG (CCGC)
          CCGA (TCGG)

CGase I restriction of natural DNA (i.e., pUC19, lambda) results in fragments
ranging from 20-200 bp in length (average 20-60 bp). Heat denaturation of these
fragments generates numerous oligonucleotides of variable length but precise
specificity for the cognate template as was the case with CviJ I* digestion. CGase
I restriction of the small plasmid pUC19 (2689 bp) theoretically yields 174
restriction fragments, or 384 oligonucleotides after a heat denaturation step.
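
A theoretical fragment count of this kind can be obtained by an in-silico digest of the circular plasmid sequence. The sketch below is a simplification under stated assumptions: it cuts at every occurrence of the dinucleotide (ignoring any flanking-base preference of the reagent), treats the cut as blunt, and uses a hypothetical function name and toy sequences rather than pUC19.

```python
def circular_fragments(seq: str, site: str = "CG") -> list:
    """Fragment lengths from cutting a circular sequence at every `site`."""
    seq = seq.upper()
    n = len(seq)
    doubled = seq + seq[: len(site) - 1]  # lets a site span the origin
    cuts = [i for i in range(n) if doubled[i:i + len(site)] == site]
    if not cuts:
        return [n]  # uncut circle
    # Distance from each cut to the next, wrapping around the origin;
    # a single cut simply linearizes the circle.
    return [(cuts[(k + 1) % len(cuts)] - cuts[k]) % n or n for k in range(len(cuts))]

# Toy circle with two CG sites.
lengths = circular_fragments("ACGTTACGAA")
print(len(lengths), "fragments:", lengths)
```

On a circular molecule the number of fragments equals the number of cut sites, and the fragment lengths always sum to the plasmid length, which gives a simple sanity check on any such digest.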
The "two-cutter" activity of CviJ I* and CGase I represent a unique class
of restriction endonuclease activity in that no other known restriction
endonucleases will generate this size range of oligonucleotides. The ability to
generate numerous oligonucleotides with perfect sequence specificity from any
DNA, without regard to sequence composition, genetic origin, or prior sequence
knowledge is one of the properties that CGase I shares with CviJ I*. In addition,
the generation of numerous oligonucleotides by CviJ I* or CGase I results in a
form of probe or primer amplification not practical using conventional means of
organic synthesis.

Based on ability to recognize a dinucleotide sequence, the present invention
contemplates the interchangeability of CGase I with CviJ I* in all of the
applications described herein.

Example 21 

Purification of CviJ I Restriction Endonuclease from
IL-3A-Infected Chlorella Cells

CviJ I was prepared by a modification of the method described by
Xia et al., Nucl. Acids Res. 15:6075-6090 (1987). Chlorella NC64A cells
(ATCC Accession No. 75399 deposited on January 21, 1993, American Type
Culture Collection, Rockville, Maryland) were infected with the virus IL-3A
(ATCC Accession No. 75354 deposited November 6, 1992, American Type
Culture Collection, Rockville, Maryland) according to Van Etten et al., Virology
126:117-125 (1983). Five grams of IL-3A infected Chlorella NC64A cells were
suspended in a glass homogenization flask with 15 g of 0.3 mm glass beads in
buffer A (10 mM Tris-HCl pH 7.9, 10 mM 2-mercaptoethanol, 50 μg/ml
phenylmethylsulfonyl fluoride (PMSF), 20 μg/ml benzamidine, 2 μg/ml
o-phenanthroline). Cell lysis was carried out at 4000 rpm for 90 sec in a Braun
MSK mechanical homogenizer (Allentown, PA) with cooling from a CO2 tank.
After lysis 2 M NaCl was added to a final concentration of 200 mM, after which
10% polyethyleneimine (PEI) (Life Technologies, Bethesda, MD) (pH 7.5) was
added to a final concentration of 0.3%. The mixture was then stirred for 2 hrs.
at 4°C then centrifuged for 1 hr. at 50,000 g. Ammonium sulfate was added to
the supernatant to 70% saturation and stirred overnight. A protein pellet was
recovered by centrifugation for 1 hr. at 50,000 g. The resulting pellet was
dissolved in 20 ml of buffer B (20 mM Tris-acetate pH 7.5, 0.5 mM EDTA, 10
mM 2-mercaptoethanol, 10% glycerol, 30 mM KCl, 50 μg/ml PMSF, 20 μg/ml
benzamidine [Sigma, St. Louis, Missouri], 2 μg/ml o-phenanthroline [Sigma]) and
dialysed against 500 ml of buffer B with 3 changes. The dialysed solution was
then applied to a 1 x 6 cm Heparin-Sepharose (Pharmacia LKB, Piscataway, New
Jersey) column. After a 50 ml wash with buffer B, a 100 ml gradient of 0 to 0.7
M KCl in buffer B was run. Fractions having CviJ I activity, as measured by
digestion of pUC19 DNA and agarose gel electrophoresis, were pooled, diluted
in 5 volumes of buffer C (10 mM K/PO4 pH 7.4, 0.5 mM EDTA, 10 mM
2-mercaptoethanol, 75 mM NaCl, 0.05% Triton X-100, 10% glycerol, 50 μg/ml
PMSF, 20 μg/ml benzamidine, 2 μg/ml o-phenanthroline) and applied to a 1 x 7
cm Phosphocellulose P11 (Whatman) column equilibrated in buffer C. After
washing with 30 ml of buffer C, CviJ I was eluted by a 100 ml gradient of 0 to
0.7 M NaCl in buffer C. At this step CviJ I activity separated from non-specific
nucleases. CviJ I containing fractions were pooled and diluted in 4 volumes of
buffer C and applied to a 1 x 4 cm hydroxyapatite HTP column (BioRad,
Hercules, CA). After washing with 30 ml of buffer C, CviJ I was eluted by a 0
to 0.7 M potassium phosphate (pH 7.4) gradient in buffer C. Active fractions
containing CviJ I activity and lacking non-specific nuclease activity were pooled
and were dialysed overnight against storage buffer (50 mM potassium phosphate,
200 mM KCl, 0.5 mM EDTA, 50% glycerol, 20 μg/ml PMSF) and
stored at -20°C.
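
The "added to a final concentration of" steps in this procedure are ordinary mixing calculations. The sketch below is illustrative only: the lysate volume is hypothetical (the text does not state one), the helper name is invented, and the sample is assumed to contain none of the added component beforehand.

```python
def stock_volume_to_add(sample_ml, stock_conc, final_conc):
    """Volume of stock to add so the mixture reaches final_conc.

    Solves stock_conc * v = final_conc * (sample_ml + v) for v, assuming the
    sample initially contains none of the solute. Concentrations share one unit.
    """
    if final_conc >= stock_conc:
        raise ValueError("final concentration must be below the stock concentration")
    return sample_ml * final_conc / (stock_conc - final_conc)

lysate_ml = 30.0  # hypothetical lysate volume; the text does not state one

# 2 M NaCl stock to 200 mM final (both in mM).
nacl_ml = stock_volume_to_add(lysate_ml, stock_conc=2000.0, final_conc=200.0)

# 10% PEI stock to 0.3% final, added after the NaCl (both in percent).
pei_ml = stock_volume_to_add(lysate_ml + nacl_ml, stock_conc=10.0, final_conc=0.3)

print(f"add {nacl_ml:.2f} ml of 2 M NaCl, then {pei_ml:.2f} ml of 10% PEI")
```

Note that adding the PEI slightly dilutes the NaCl again; the sketch treats the two additions independently, as the text does.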

Although the present invention has been described in terms of
preferred embodiments, it is intended that the present invention encompass all
modifications and variations which occur to those skilled in the art upon
consideration of the disclosure herein, and in particular those embodiments which
are within the broadest proper interpretation of the claims and their requirements.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Molecular Biology Resources, Inc.

(ii) TITLE OF INVENTION: Materials and Methods for
Restriction Endonuclease Applications

(iii) NUMBER OF SEQUENCES: 13

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Marshall, O'Toole, Gerstein, Murray & Borun

(B) STREET: 6300 Sears Tower, 233 South Wacker Drive 

(C) CITY: Chicago 

(D) STATE: Illinois 

(E) COUNTRY: United States of America 

(F) ZIP: 60606-6402 

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS /MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER:

(B) FILING DATE: 

(C) CLASSIFICATION: 

(viii) ATTORNEY /AGENT INFORMATION: 

(A) NAME: Clough, David W. 

(B) REGISTRATION NUMBER: 36,107 

(C) REFERENCE/DOCKET NUMBER: 28003/31967/PCT 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 312/474-6300 

(B) TELEFAX: 312/474-0448 

(C) TELEX: 25-3856 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
CAATTTCACA CAGGAAACAG CTATGTCTTT TCGCACGTTA GAAC 44

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5496 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 



ATGTCTTTTC GCACGTTAGA ACTATTCGCC GGTATAGCTG GTATTTCACA TGGCCTCAGA 60

GGTATATCTA CACCAGTTGC ATTCGTAGAA ATTAATGAAG ACGCACAAAA ATTCTTGAAA 120

ACAAAGTTTT CAGATGCATC TGTATTCAAT GACGTTACGA AATTTACCAA ATCGGACTTC 180

CCAGAAGACA TAGACATGAT TACTGCGGGA TTCCCGTGCA CTGGGTTTAG TATTGCAGGT 240

TCTAGAACTG GATTCGAACA CAAGGAATCC GGTCTCTTTG CTGATGTTGT GCGAATCACG 300

GAAGAGTATA AACCTAAAAT AGTGTTTTTG GAAAACTCCC ATATGTTGTC CCACACTTAC 360

AATCTCGATG TCGTCGTAAA AAAGATGGAT GAAATTGGTT ATTTCTGCAA GTGGGTAACT 420

TGTCGGGCAT CAATTATAGG AGCCCATCAT CAACGCCACC GGTGGTTTTG TCTCGCGATT 480

CGAAAAGATT ATGAACCAGA AGAAATAATT GTATCTGTGA ATGCTACAAA GTTCGACTGG 540

GAAAATAATG AACCACCGTG TCAAGTAGAC AATAAGAGTT ACGAGAATTC AACTCTTGTT 600

CGTCTGGCAG GATATTCCGT GGTCCCCGAC CAGATCAGAT ATGCTTTCAC CGGTCTATTT 660

ACAGGTGATT TTGAGTCATC GTGGAAAACT ACCTTGACAC CTGGGACAAT AATTGGCACG 720

GAACACAAAA AAATGAAAGG AACTTACGAT AAAGTCATAA ACGGGTATTA TGAGAACGAT 780

GTGTATTATT CTTTTTCAAG GAAAGAAGTT CATCGCGCTC CTCTAAATAT ATCCGTGAAA 840

CCACGTGATA TTCCGGAGAA ACATAACGGA AAAACACTCG TAGATCGCGA AATGATCAAG 900

AAATATTGGT GCACACCATG TGCTAGTTAT GGCACTGCTA CTGCTGGATG CAATGTTCTG 960

ACAGACCGTC AGTCACATGC ACTTCCTACA CAAGTCAGGT TTTCATATAG GGGTGTATGT 1020

GGACGACATT TGTCTGGTAT ATGGTGTGCA TGGTTGATGC GGTATGACCA AGAATATCTT 1080

GGTTATTTGG TTCAATATGA TTAAAATATT TTGATACACT AAATGGATAT AAGAAGAAAA 1140

CGTTTTACAA TAGAAGGGGC TAAACGTATA ATACTCGAAA AAAAGAGACT TGAAGAGAAA 1200

AAAAGAATTG CGGAAGAGAA AAAAAGAATT GCACTTATAG AAAAACAACG AATTGCGGAA 1260

GAGAAAAAAA GAATTGCGGA AGAGAAAAAA CGATTCGCAC TTGAAGAGAA AAAACGAATT 1320

GCGGAAGAAA AAAAACGAAT CGCGGAAGAG AAAAAACGAA TCGTGGAAGA GAAAAAAAGA 1380

CTTGCACTTA TAGAAAAACA ACGAATTGCG GAAGAGAAAA TTGCGTCGGG GAGAAAAATT 1440

AGAAAGAGGA TCTCTACAAA TGCAACAAAA CATGAAAGAG AATTTGTCAA AGTTATAAAT 1500

TCAATGTTCG TCGGACCCGC TACTTTTGTA TTCGTAGATA TAAAAGGTAA TAAATCCAGA 1560


GAAATCCACA ACGTTGTAAG ATTCAGACAA TTACAAGGCA GTAAAGCGAA ATCCCCGACC 1620

GCGTATGTTG ATAGAGAATA TAACAAACCT AAAGCGGATA TAGCAGCGGT AGACATAACC 1680

GGTAAAGATG TGGCATGGAT ATCCCATAAA GCATCTGAAG GATATCAACA ATATCTAAAA 1740

ATTTCTGGAA AGAACCTCAA GTTCACAGGA AAAGAATTAG AAGAAGTTCT ATCGTTCAAG 1800

AGAAAAGTAG TTAGTATGGC ACCGGTATCT AAAATATGGC CTGCTAATAA GACCGTATGG 1860

TCTCCTATCA AGTCAAATTT GATTAAAAAT CAAGCAATAT TCGGATTTGA TTACGGTAAG 1920

AAACCAGGAA GGGACAATGT AGACATCATA GGTCAAGGAC GACCAATTAT AACAAAAAGA 1980

GGTTCCATAT TATATCTTAC ATTCACTGGT TTTAGCGCAT TAAATGGGCA CTTGGAGAAT 2040

TTTACTGGGA AACATGAACC CGTTTTCTAT GTAAGAACAG AACGGAGTAG TAGCGGGAGA 2100

AGTATAACAA CTGTCGTCAA TGGTGTCACT TATAAAAATT TAAGATTCTT TATACATCCA 2160

TACAACTTTG TTTCTTCAAA AACACAACGT ATTATGTAGG ACCATTTTCC CGAGAGACTT 2220

TGTTGACCGC GTACTAAAAA ATGGTCACGA TATTTGTCTA AAGATGCTCA TAGAAGCAGG 2280

TGCAAACCTT GACATCGTCA CTGTTGAGTA TACACCATTA CATCTACATG TGGTGATATT 2340

TGTATAAACG GTAAATACCT ATATATACAA TACGTATCCC CCTAAAAGCG CTTAGATTTT 2400

TTAGTTGTAT ACTACTTTTG TATAAGACCT GTAAGTTACA AACTAAAAGT TTCAGCTTTG 2460

CCTTCGAAAC AAGCAATTAC CGCATGAGAA TAATATCCAT TATGGATGTT TTCTGCTAAT 2520

AAAACGATAT TTCCTACAGA AGTTTCTATG ATTAGTTCCG AAATATTGAG ATCATCGTCA 2580

CGTTTTTCTT TACCGTATTT TACTTTCGTG ATCGTCGCAC CAATAAAATC ATCTCGTGTG 2640

AGTTCATTCG GCAATTGTGC CGTGACACCA AATCTCTCAC AACAACCTTG ATGTCCATCC 2700

ATTGCTAACA CTATCGGTAA TCCATGTGTG GTGTGTACGA CCACACCGTT ATAACTATAA 2760

CACGTGTAGT TGTCGTCTAT ATCATATAAC TCGAGAGCGG TGTGAACTTC TTCAGATCTA 2820

TTATTAATCG GATCTGATCC ATAAGAAGAA TCTTCATATT TACAAATAAA ATCATCCGAT 2880

ATGTTCTGCA CACGAACAAC ATTCGTCAAA TTTCTGTGAT GACGAATCTC CATCTCTGAA 2940

TCATTAGAGA CTTGCGAGTA TATAACATTA TAATTGTTGA TATGATTATT ACGTTTCATA 3000

TCAACAAAAT ACATATAAAC ACCATACAAA TATTAAAACA CGTTAGTATA TAATGGATAA 3060

CATTTGCAAT AGTATATTCA CTGCAGTAAA AAATGGCCAC GAAGCTTGTT TGAAGATGAT 3120

GCTCATTGAA AGAGGTAGCA ATATCAATGA TGTTTCCGAA TCAAAATATG GAAATACACC 3180

ACTACATATT GCAGCTCATC ATGGTAATGA TGTGTGTTTG AAGATGCTTA TTGACGCAGG 3240

TGCAAACCTT GATATCACAG ATATTTCTGG AGGAACACCA CTTCATCGTG CGGTTTTGAA 3300

TGGCCATGAC ATATTGTACA GATGCTCGTA GAAGCAGGTG CAAACCTTAG TATCATAACT 3360


AATTTGGGAT GGATACCGTT ACATTACGCG GCTTTTAATG GTAATGATGC GATTTTGAGG 3420

ATGCTCATCG TTGTAAGTGA TAATGTTGAC GTTATCAATG ATCGCGGTTG GACGGCGTTA 3480

CATTACGCGG CTTTTAATGG TCATAGCATG TGCGTCAAGA CGCTTATTGA TGCGGGTGCA 3540

AATCTTGACA TCACAGATAT TTCGGGATGT ACACCACTTC ATCGTGCGGT TTATAATGAC 3600

CACGATGCAT GTGTGAAGAT ACTCGTAGAA GCAGGTGCAA CTCTTGACGT CATTGATGAT 3660

ACTGAGTGGG TGCCGTTACA TTACGCGGCT TTTAATGGTA ATGATGCGAT TTTGAGGATG 3720

CTCATTGAAG CAGGTGCAGA TATTGATATA TCTAATATAT GTGATTGGAC GGCGTTACAT 3780

TACGCGGCTC GAAATGGACA CGATGTGTGT ATAAAAACAC TCATCGAAGC AGGTGGTAAC 3840

ATCAACGCCG TCAACAAATC GGGGGATACA CCACTAGATA TTGCAGCATG TCATGACATT 3900

GCAGTATGTG TGATCGTGAT AGTCAATAAG ATCGTTTCGG AGCGGCCGTT GCGTCCGAGT 3960

GAGTTGTCTG TCATACCACC AACGTCTGCT GCATTAGGTG ATGTGTTGCG AACGACGATG 4020

CGGCTTCATG GGCGATCGGA AGCTGCAAAG ATCACAGCGC ATCTTCCTGT GGGTGCAAGG 4080

GATACTCTAC GAACTACTGC GTTGTGTTTG AACCGAACAA TTTCCGAGAG ATCTCGTTGA 4140

TAGTGTATTA ATTGAATGCG TGTAAAGTTA CGCTATTTTT TTCCAAAAAG GGTTTGCATG 4200

AAATACAACA CGATCTTTTG TAGATCGTTT ACCATTAGTT GTATTCGTGC AATAGAGACC 4260

ATACGTACCT CCAAATTCAT TTACTTTACC TACAGTATTA CCACTTCCTT TTTTTCCTAT 4320

AGTAGTATCT AAATTCAACC CTTTGAACTC ATCGCCATTA ACAGACAGAG CGTATGAACC 4380

GTTTTGTGCC AATTTCACCT TCAAAACGAT AGTAACCCAT TGACCTCTAG GAATTTTAAC 4440

CCATCTTATA AGTATCTGCT TACTTCCAAG TCCTTTTTCA AAAGCATACA ACGATCCTGT 4500

AAGGTTATCC CCAGAACCTG AAATTGTAAA GAACGACTGG AAATGAATAG GTTGCATTAG 4560

ATCTGTATAC ATATCACTTG GTTCGAAATG AAAATCGTAG TCCCAATTAG GTACGTTCCA 4620

CCAAGTTTAA TACGGGGTCT TTCCACCGAG ACCGGACATT TCAGCACGAG CCTTGTAAGA 4680

ATGATATGAT GTGGTTAAAT CTCTATCACC ATCGTTCCAC TTTCCTCTGA ACCGAAGACC 4740

ATGCATCGTT ATACCTGGTG CAACCTGTAC TAAATTCTTT ATTTCAGGTG CGGCTCGGGG 4800

TGGATTAACT CGAGATTCGT CAAATCTAAA ATATGATAAC GATGTTCCAA CAGTAGAACC 4860

ACTGGGTGGT ATGGCAGTTG CTGGAAGGGA AGGTAAAACT TTAGGATATT TCAAATCACC 4920

AACACCTTGA GGGTTTACTT GAATACTTCT GGGAGATGTT GGTGGTTTCG TCGAAGGTGG 4980

TTTCGTTGAA GGTGGTTTCG TCGAAGGTGG TTTCGTCGAA GGTGGTTTCG TCGAAGGTGG 5040

TTTCGTCGAA GGTGGTTTCG TCGAAGGTGG TTTCGTCGAA GGTGGTTTCG TCGAAGGTGG 5100

TTTCGTCGAA GGTGGTTTCG TCGAAGGTGG TTTCGTCGAA GGTGGTTTCG TCGAAGGTGG 5160

TTTCGTCGAA GGTGGTTTCG TCGAAGGTGG TTTCGTTGGC GGAAGTGGGG CATGACCATA 5220

ATCCGTTAAA TTCCCGCATT CACCTAATGA TGTACTCCAT AAAGAACCCG GTGCGCATTG 5280

CATTCTTATT GGTTCTGTAG TATCAGATAT ACATACGAAA TAATGAGAAT CATTTTCCCT 5340

GCCAAATAAT TTACCAGATT TGCCTTTACA TGACATTATT TGTAATATAA TATTATTATA 5400

ATTTTAAAAA AACTAACGTC TATTTAAAAT TATGTAATAC GTATTATATC AATGCATCAT 5460

CTTAATCATT TCCTAACGTA TAAGCGTAGC GAATTC 5496

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1225 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: join(1..33, 55..1128)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

CAA 6AA TAT CTT GGT TAT TT6 6TT CAA TAT GAT TAAAATATTT TGATACACTA 53 
Gin Glu Tyr Leu Gly Tyr Leu Val Gin Tyr Asp 
15 10 

A ATG GAT ATA AGA AGA AAA CGT TTT ACA ATA GAA GGG GCT AAA CGT 99 
Met Asp lie Arg Arg Lys Arg Phe Thr lie Glu Gly Ala Lys Arg 
15 20 25 

ATA ATA CTC GAA AAA AAC AGA CTT GAA GAG AAA AAA AGA ATT GCG GAA 147 
lie lie Leu Glu Lys Lys Arg Leu Glu Glu Lys Lys Arg lie Ala Glu 
30 35 40 

GAG AAA AAA AGA ATT GCA CTT ATA GAA AAA CAA CGA ATT GCG GAA GAG 195 
Glu Lys Lys Arg lie Ala Leu lie Glu Lys Gin Arg lie Ala Glu Glu 
45 50 55 

AAA AAA AGA ATT GCG CAA GAG AAA AAA CGA TTC GCA CTT GAA GAG AAA 243 
Lye Lys Arg lie Ala Glu Glu Lys Lys Arg Phe Ala Leu Glu Glu Lys 
60 65 70 

AAA CGA ATT GCG GAA GAA AAA AAA CGA ATC GCG GAA GAG AAA AAA CGA 291 
Lys Arg lie Ala Glu Glu Lys Lys Arg lie Ala Glu Glu Lys Lys Arg 
75 80 85 90 

ATC GTG GAA GAG AAA AAA AGA CTT GCA CTT ATA GAA AAA CAA CGA ATT 339 
lie Val Glu Glu Lys Lys Arg Leu Ala Leu lie Glu Lys Gin Arg lie 
95 100 105 

GCG GAA GAG AAA ATT GCG TCG GGG AGA AAA ATT AGA AAG AGG ATC TCT 387 
Ala Glu Glu Lys lie Ala Ser Gly Arg Lys lie Arg Lys Arg lie Ser 
110 115 120 
ACA AAT GCA ACA AAA CAT GAA AGA GAA TTT GTC AAA GTT ATA AAT TCA 435
Thr Asn Ala Thr Lys His Glu Arg Glu Phe Val Lys Val Ile Asn Ser
125 130 135

ATG TTC GTC GGA CCC GCT ACT TTT GTA TTC GTA GAT ATA AAA GGT AAT 483
Met Phe Val Gly Pro Ala Thr Phe Val Phe Val Asp Ile Lys Gly Asn
140 145 150

AAA TCC AGA GAA ATC CAC AAC GTT GTA AGA TTC AGA GAA TTA CAA GGC 531
Lys Ser Arg Glu Ile His Asn Val Val Arg Phe Arg Gln Leu Gln Gly
155 160 165 170

AGT AAA GCG AAA TCC CCG ACC GCG TAT GTT GAT AGA GAA TAT AAC AAA 579
Ser Lys Ala Lys Ser Pro Thr Ala Tyr Val Asp Arg Glu Tyr Asn Lys
175 180 185

CCT AAA GCG GAT ATA GCA GCG GTA GAC ATA ACC GGT AAA GAT GTG GCA 627
Pro Lys Ala Asp Ile Ala Ala Val Asp Ile Thr Gly Lys Asp Val Ala
190 195 200

TGG ATA TCC CAT AAA GCA TCT GAA GGA TAT CAA CAA TAT CTA AAA ATT 675
Trp Ile Ser His Lys Ala Ser Glu Gly Tyr Gln Gln Tyr Leu Lys Ile
205 210 215

TCT GGA AAG AAC CTC AAG TTC ACA GGA AAA GAA TTA GAA GAA GTT CTA 723
Ser Gly Lys Asn Leu Lys Phe Thr Gly Lys Glu Leu Glu Glu Val Leu
220 225 230

TCG TTC AAG AGA AAA GTA GTT AGT ATG GCA CCG GTA TCT AAA ATA TGG 771
Ser Phe Lys Arg Lys Val Val Ser Met Ala Pro Val Ser Lys Ile Trp
235 240 245 250

CCT GCT AAT AAG ACC GTA TGG TCT CCT ATC AAG TCA AAT TTG ATT AAA 819
Pro Ala Asn Lys Thr Val Trp Ser Pro Ile Lys Ser Asn Leu Ile Lys
255 260 265

AAT CAA GCA ATA TTC GGA TTT GAT TAC GGT AAG AAA CCA GGA AGG GAC 867
Asn Gln Ala Ile Phe Gly Phe Asp Tyr Gly Lys Lys Pro Gly Arg Asp
270 275 280

AAT GTA GAC ATC ATA GGT CAA GGA CGA CCA ATT ATA ACA AAA AGA GGT 915
Asn Val Asp Ile Ile Gly Gln Gly Arg Pro Ile Ile Thr Lys Arg Gly
285 290 295

TCC ATA TTA TAT CTT ACA TTC ACT GGT TTT AGC GCA TTA AAT GGG CAC 963
Ser Ile Leu Tyr Leu Thr Phe Thr Gly Phe Ser Ala Leu Asn Gly His
300 305 310

TTG GAG AAT TTT ACT GGG AAA CAT GAA CCC GTT TTC TAT GTA AGA ACA 1011
Leu Glu Asn Phe Thr Gly Lys His Glu Pro Val Phe Tyr Val Arg Thr
315 320 325 330

GAA CGG AGT AGT AGC GGG AGA AGT ATA ACA ACT GTC GTC AAT GGT GTC 1059
Glu Arg Ser Ser Ser Gly Arg Ser Ile Thr Thr Val Val Asn Gly Val
335 340 345

ACT TAT AAA AAT TTA AGA TTC TTT ATA CAT CCA TAC AAC TTT GTT TCT 1107
Thr Tyr Lys Asn Leu Arg Phe Phe Ile His Pro Tyr Asn Phe Val Ser
350 355 360





TCA AAA ACA CAA CGT ATT ATG TAGGACCATT TTCCCGAGAG ACTTTGTTGA 1158
Ser Lys Thr Gln Arg Ile Met
365

CCGCGTACTA AAAAATGGTC ACGATATTTG TCTAAAGATG CTCATAGAAG CAGGTGCAAA 1218

CCTTGAC 1225

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 369 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Gln Glu Tyr Leu Gly Tyr Leu Val Gln Tyr Asp Met Asp Ile Arg Arg
1 5 10 15

Lys Arg Phe Thr Ile Glu Gly Ala Lys Arg Ile Ile Leu Glu Lys Lys
20 25 30

Arg Leu Glu Glu Lys Lys Arg Ile Ala Glu Glu Lys Lys Arg Ile Ala
35 40 45

Leu Ile Glu Lys Gln Arg Ile Ala Glu Glu Lys Lys Arg Ile Ala Glu
50 55 60

Glu Lys Lys Arg Phe Ala Leu Glu Glu Lys Lys Arg Ile Ala Glu Glu
65 70 75 80

Lys Lys Arg Ile Ala Glu Glu Lys Lys Arg Ile Val Glu Glu Lys Lys
85 90 95

Arg Leu Ala Leu Ile Glu Lys Gln Arg Ile Ala Glu Glu Lys Ile Ala
100 105 110

Ser Gly Arg Lys Ile Arg Lys Arg Ile Ser Thr Asn Ala Thr Lys His
115 120 125

Glu Arg Glu Phe Val Lys Val Ile Asn Ser Met Phe Val Gly Pro Ala
130 135 140

Thr Phe Val Phe Val Asp Ile Lys Gly Asn Lys Ser Arg Glu Ile His
145 150 155 160

Asn Val Val Arg Phe Arg Gln Leu Gln Gly Ser Lys Ala Lys Ser Pro
165 170 175

Thr Ala Tyr Val Asp Arg Glu Tyr Asn Lys Pro Lys Ala Asp Ile Ala
180 185 190

Ala Val Asp Ile Thr Gly Lys Asp Val Ala Trp Ile Ser His Lys Ala
195 200 205

Ser Glu Gly Tyr Gln Gln Tyr Leu Lys Ile Ser Gly Lys Asn Leu Lys
210 215 220

Phe Thr Gly Lys Glu Leu Glu Glu Val Leu Ser Phe Lys Arg Lys Val
225 230 235 240

Val Ser Met Ala Pro Val Ser Lys Ile Trp Pro Ala Asn Lys Thr Val
245 250 255

Trp Ser Pro Ile Lys Ser Asn Leu Ile Lys Asn Gln Ala Ile Phe Gly
260 265 270

Phe Asp Tyr Gly Lys Lys Pro Gly Arg Asp Asn Val Asp Ile Ile Gly
275 280 285

Gln Gly Arg Pro Ile Ile Thr Lys Arg Gly Ser Ile Leu Tyr Leu Thr
290 295 300

Phe Thr Gly Phe Ser Ala Leu Asn Gly His Leu Glu Asn Phe Thr Gly
305 310 315 320

Lys His Glu Pro Val Phe Tyr Val Arg Thr Glu Arg Ser Ser Ser Gly
325 330 335

Arg Ser Ile Thr Thr Val Val Asn Gly Val Thr Tyr Lys Asn Leu Arg
340 345 350

Phe Phe Ile His Pro Tyr Asn Phe Val Ser Ser Lys Thr Gln Arg Ile
355 360 365

Met
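
The CDS location given for SEQ ID NO:3, join(1..33, 55..1128), can be checked against the 369-residue length stated for SEQ ID NO:4. The following is a minimal sketch of that arithmetic; the function name `cds_length` is hypothetical and is not part of the listing.

```python
import re

def cds_length(location: str) -> int:
    """Total nucleotides covered by a 'join(a..b, c..d)' feature location."""
    spans = re.findall(r"(\d+)\s*\.\.\s*(\d+)", location)
    # Each span is inclusive at both ends, hence the +1.
    return sum(int(end) - int(start) + 1 for start, end in spans)

location = "join(1..33, 55..1128)"  # CDS location listed for SEQ ID NO:3
nucleotides = cds_length(location)
codons, remainder = divmod(nucleotides, 3)
print(nucleotides, codons, remainder)
```

The two segments cover 33 + 1074 = 1107 nucleotides, that is 369 whole codons with no remainder, which agrees with the length given for SEQ ID NO:4.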

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5: 
GTAAAACGAC GGCCAGT 17 
(2) INFORMATION FOR SEQ ID NO:6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
GCCAAGCTTG GATGAT 16
(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
ATCTTCGCGA ATTCACTGGC CGTCGTTTTA C 31

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
GAATTCGCGA AGAT 14

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
ATCATCCAAG CTTGGCACTG GCCGTCGTTT TAC 33 
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 81 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
GTAAAACGAC GGCCAGTGAA TTCGCGAAGA TNNNNNNNNN NNNNNNNNAT CATCCAAGCT 60
TGGCACTGGC CGTCGTTTTA C 81
(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 81 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
GTAAAACGAC GGCCAGTGCC AAGCTTGGAT GATNNNNNNN NNNNNNNNNN ATCTTCGCGA 60
ATTCACTGGC CGTCGTTTTA C 81
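
The oligonucleotides of SEQ ID NOS: 5 to 11 appear to be built from one another, and the relationships can be checked mechanically. The sketch below assumes that each "6" in the scanned listing stands for G and that SEQ ID NO: 9 ends in C; the helper names `revcomp` and `with_spacer` are hypothetical.

```python
# Sequences as read from the listing, with the scan's "6" taken as G
# (an assumption about the scan, not a statement made in the listing).
SEQ5 = "GTAAAACGACGGCCAGT"
SEQ6 = "GCCAAGCTTGGATGAT"
SEQ7 = "ATCTTCGCGAATTCACTGGCCGTCGTTTTAC"
SEQ8 = "GAATTCGCGAAGAT"
SEQ9 = "ATCATCCAAGCTTGGCACTGGCCGTCGTTTTAC"

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    """Reverse complement of an unambiguous DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def with_spacer(left: str, right: str, n: int = 17) -> str:
    """Join two flanks around a run of n unspecified bases (N)."""
    return left + "N" * n + right

# SEQ ID NOS: 10 and 11 as assembled from the shorter oligonucleotides.
seq10 = with_spacer(SEQ5 + SEQ8, SEQ9)
seq11 = with_spacer(SEQ5 + SEQ6, SEQ7)

print(SEQ7 == revcomp(SEQ8) + revcomp(SEQ5))
print(SEQ9 == revcomp(SEQ6) + revcomp(SEQ5))
print(len(seq10), len(seq11))
```

Under these readings both comparisons hold, and each assembled sequence is 81 bases long with a 17-base run of N, matching the LENGTH entries for SEQ ID NOS: 10 and 11.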
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 270 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: join(26..148, 190..207, 244..270)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

TAACAATTTC ACACAGGAAA CAGCT ATG ACC ATG ATT ACG CCA AGC TCG AAA 52
Met Thr Met Ile Thr Pro Ser Ser Lys
1 5

TTA ACC CTC ACT AAA GGG AAC AAA AGC TGG TAC CGG GGC CCC CCC TCG 100
Leu Thr Leu Thr Lys Gly Asn Lys Ser Trp Tyr Arg Gly Pro Pro Ser
10 15 20 25

AGG TCG ACG GTA TCG ATA AGC TTG ATA AAC CAT TTA TAC AAT AAG CGT 148
Arg Ser Thr Val Ser Ile Ser Leu Ile Asn His Leu Tyr Asn Lys Arg
30 35 40

TGATATAAGT TTGTATATAC GTCATTTCGT TATATCAACA A ATG TTA TCA TAT 201
Met Leu Ser Tyr
45

TAT ACG TAAAACTGGC TTAAAAAAAA ACGAGGTGTA ACTATA ATG TCT TTT CGC 255
Tyr Thr Met Ser Phe Arg
50

ACG TTA GAA CTA TTT 270
Thr Leu Glu Leu Phe
55
(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 56 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Met Thr Met Ile Thr Pro Ser Ser Lys Leu Thr Leu Thr Lys Gly Asn
1 5 10 15

Lys Ser Trp Tyr Arg Gly Pro Pro Ser Arg Ser Thr Val Ser Ile Ser
20 25 30

Leu Ile Asn His Leu Tyr Asn Lys Arg Met Leu Ser Tyr Tyr Thr Met
35 40 45

Ser Phe Arg Thr Leu Glu Leu Phe
50 55
INDICATIONS RELATING TO A DEPOSITED MICROORGANISM

(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the description
on page ___, line 13

B. IDENTIFICATION OF DEPOSIT

Further deposits are identified on an additional sheet

Name of depositary institution

American Type Culture Collection

Address of depositary institution (including postal code and country)

12301 Parklawn Drive
Rockville, Maryland 20852
UNITED STATES OF AMERICA

Date of deposit

November 6, 1992

Accession Number

A.T.C.C. 75354

C. ADDITIONAL INDICATIONS (leave blank if not applicable) This information is continued on an additional sheet

"In respect of those designations in which a European patent is sought,
a sample of the deposited microorganism will be made available until the
publication of the mention of the grant of the European patent or until the
date on which the application has been refused or withdrawn or is deemed to
be withdrawn, only by the issue of such a sample to an expert nominated by
the person requesting the sample (Rule 23(4) EPC)."

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications, e.g., "Accession Number of Deposit")

For receiving Office use only

This sheet was received with the international application

Authorized officer

For International Bureau use only

This sheet was received by the International Bureau on:

Authorized officer

Form PCT/RO/134 (July 1992)
INDICATIONS RELATING TO A DEPOSITED MICROORGANISM

(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the description
on page 79, line ___

B. IDENTIFICATION OF DEPOSIT

Further deposits are identified on an additional sheet

Name of depositary institution

American Type Culture Collection

Address of depositary institution (including postal code and country)

12301 Parklawn Drive
Rockville, Maryland 20852
UNITED STATES OF AMERICA

Date of deposit

January 21, 1993

Accession Number

A.T.C.C. 75399

C. ADDITIONAL INDICATIONS (leave blank if not applicable) This information is continued on an additional sheet

"In respect of those designations in which a European patent is sought,
a sample of the deposited microorganism will be made available until the
publication of the mention of the grant of the European patent or until the
date on which the application has been refused or withdrawn or is deemed to
be withdrawn, only by the issue of such a sample to an expert nominated by
the person requesting the sample (Rule 23(4) EPC)."

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications, e.g., "Accession Number of Deposit")

For receiving Office use only

This sheet was received with the international application

For International Bureau use only

This sheet was received by the International Bureau on:

Authorized officer

Form PCT/RO/134 (July 1992)
INDICATIONS RELATING TO A DEPOSITED MICROORGANISM

(PCT Rule 13bis)

A. The indications made below relate to the microorganism referred to in the description
on page 31, line 25

B. IDENTIFICATION OF DEPOSIT

Further deposits are identified on an additional sheet

Name of depositary institution

American Type Culture Collection

Address of depositary institution (including postal code and country)

12301 Parklawn Drive
Rockville, Maryland 20852
UNITED STATES OF AMERICA

Date of deposit

June 30, 199A

Accession Number

A.T.C.C. 69341

C. ADDITIONAL INDICATIONS (leave blank if not applicable) This information is continued on an additional sheet

"In respect of those designations in which a European patent is sought,
a sample of the deposited microorganism will be made available until the
publication of the mention of the grant of the European patent or until the
date on which the application has been refused or withdrawn or is deemed to
be withdrawn, only by the issue of such a sample to an expert nominated by
the person requesting the sample (Rule 23(4) EPC)."

D. DESIGNATED STATES FOR WHICH INDICATIONS ARE MADE (if the indications are not for all designated States)

E. SEPARATE FURNISHING OF INDICATIONS (leave blank if not applicable)

The indications listed below will be submitted to the International Bureau later (specify the general nature of the indications, e.g., "Accession Number of Deposit")

For receiving Office use only

This sheet was received with the international application

For International Bureau use only

This sheet was received by the International Bureau on:

Authorized officer

Form PCT/RO/134 (July 1992)
WE CLAIM:

1. A purified and isolated polynucleotide encoding a CviJI
polypeptide or a variant thereof possessing activity characteristic of CviJI, said
polynucleotide comprising a polynucleotide as set out in SEQ ID NO: 2.



2. The polynucleotide of claim 1 which is a DNA. 



3. The DNA of claim 2 which is a viral genomic DNA 
sequence or a biological replica thereof. 

4. The DNA of claim 2 which is a wholly or partially 
chemically synthesized DNA or biological replica thereof. 

5. A purified isolated DNA encoding a polypeptide according
to claim 1 by means of degenerate codons.

6. A vector comprising a DNA according to claim 2. 

7. The vector of claim 6 which is the plasmid pCJH1.4 (ATCC
Accession No. 69341).

8. A host cell stably transformed or transfected with a DNA 
according to claim 2 in a manner allowing the expression in said host cell of a 
CviJI polypeptide or a variant thereof possessing a sequence specificity 
characteristic of CviJI.

9. The host cell according to claim 8, wherein said host cell
is E. coli.
10. A method for producing a CviJI polypeptide or a variant
thereof possessing biological activity specific to CviJI, said method comprising the
steps of:

a) growing a transformed host cell containing a vector
according to claim 6 in a suitable nutrient medium; and

b) isolating the CviJI polypeptide or variant thereof from
said host cell.

11. The method of claim 10 wherein said host cell is E. coli.

12. A recombinant CviJI polypeptide.

13. A polypeptide produced by the method of claim 10. 

14. A method for restriction endonuclease digestion of DNA 
comprising the step of digesting DNA with a restriction endonuclease reagent 
under conditions wherein said DNA is cleaved at a dinucleotide sequence selected 
from the group consisting of PyGCPy, PuGCPy, PuGCPu, and wherein Pu = 
purine and Py = pyrimidine. 
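
The site classes named in claim 14 can be illustrated by scanning a sequence for GC dinucleotides and classifying the flanking bases as purine or pyrimidine. This is only a sketch of the recognition pattern: the names `base_class` and `gc_sites` are hypothetical, and the code reports where matching sites lie without modeling cleavage or reaction conditions.

```python
PURINES = set("AG")
PYRIMIDINES = set("CT")

# Flanking-base classes (5' flank, 3' flank) listed in claim 14:
# PyGCPy, PuGCPy and PuGCPu. PyGCPu is not in the list.
CLAIM_14_CLASSES = {("Py", "Py"), ("Pu", "Py"), ("Pu", "Pu")}

def base_class(base: str):
    """Return 'Pu', 'Py', or None for anything that is not A, C, G or T."""
    if base in PURINES:
        return "Pu"
    if base in PYRIMIDINES:
        return "Py"
    return None

def gc_sites(seq: str, allowed=CLAIM_14_CLASSES):
    """0-based index of the G in every N-G-C-N site whose flanks are allowed."""
    seq = seq.upper()
    hits = []
    for i in range(1, len(seq) - 2):
        if seq[i:i + 2] == "GC":
            flanks = (base_class(seq[i - 1]), base_class(seq[i + 2]))
            if flanks in allowed:
                hits.append(i)
    return hits

print(gc_sites("GTAAAACGACGGCCAGT"))  # the 17-mer of SEQ ID NO: 5
print(gc_sites("TGCA"))               # PyGCPu, not in the claim 14 list
```

Passing `allowed={("Pu", "Py")}` restricts the scan to PuGCPy sites alone.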

15. A method for restriction endonuclease digestion of DNA 
comprising the step of digesting DNA with a restriction endonuclease reagent 
under conditions wherein said DNA is digested at 11 of 16 possible dinucleotide 
sequences and wherein said dinucleotide sequences are selected from the group 
consisting of PuCGPu, PuCGPy, PyCGPy and PyCGPu, and wherein Pu = 
purine and Py = pyrimidine. 

16. The method according to claim 14 wherein said restriction
endonuclease reagent comprises CviJ I.

17. A restriction endonuclease reagent, said restriction
endonuclease reagent comprising in combination, Taq I and Hpa II (CGase I),
said reagent capable of digesting DNA at 11 of 16 possible dinucleotide
sequences, said sequences selected from the group consisting of PuCGPu,
PuCGPy, PyCGPy and PyCGPu, and wherein Pu = purine and Py = pyrimidine.

18. The method according to claim 15 wherein said restriction 
endonuclease reagent is selected from the group consisting of Aci I and CGase I. 



19. The method according to claim 16 wherein said digestion 
of DNA is a partial digestion and wherein said digestion generates quasi-random 
fragments of DNA without apparent site preference as seen on a 1-2 wt. % agarose 
gel. 

20. The method according to claim 18 wherein said digestion of
DNA is a partial digestion and wherein said digestion generates quasi-random
fragments of DNA without apparent site preference as seen on a 1-2 wt. % agarose 
gel. 

21. The method according to claims 16 or 18 wherein said 
digestion is complete, and wherein said digestion generates DNA fragments from 
about 20 base pairs in length to about 200 base pairs in length and wherein said 
fragments have an average length of about 20 to about 60 nucleotides. 
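
A rough expectation for the average fragment length after complete digestion can be worked out from site frequency. The sketch below assumes a random sequence in which the four bases are independent and equally frequent, which real DNA only approximates; the function name `site_probability` is hypothetical.

```python
from fractions import Fraction

def site_probability(n_flank_classes: int) -> Fraction:
    """Chance that a given position starts an N-G-C-N site of an allowed class.

    Assumes independent, equally frequent bases: G and C each occur with
    probability 1/4, and each flank falls in a given purine/pyrimidine
    class with probability 1/2.
    """
    p_gc = Fraction(1, 4) * Fraction(1, 4)
    p_one_flank_pair = Fraction(1, 2) * Fraction(1, 2)
    return p_gc * p_one_flank_pair * n_flank_classes

for label, n in (("PuGCPy only", 1), ("three classes of claim 14", 3)):
    p = site_probability(n)
    # The mean spacing between sites is the reciprocal of the site frequency.
    print(label, float(1 / p))
```

Under these assumptions the mean spacing is 64 bp for PuGCPy alone and about 21 bp when all three classes of claim 14 are cut, which is of the same order as the average length of about 20 to about 60 nucleotides recited in claim 21.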

22. The method according to claims 19 or 20 wherein said quasi-random
fragments are from about 100 base pairs to about 10,000 base pairs in
length.

23. A method for shotgun cloning and sequencing DNA,
comprising the steps of:

a) partially digesting DNA according to claims 19 or 20; 

b) ligating said partially digested DNA into a linearized 
cloning vector thereby creating a recombinant vector; 

c) introducing said recombinant vector into a host cell; 

d) selecting said host cell for the presence of said recombinant 
vector; 

e) growing and amplifying said host cell containing said 
recombinant vector; 

f) isolating and purifying said recombinant vector from said 
grown and amplified host cells; and 

g) sequencing said DNA contained in said recombinant vector. 

24. The method according to claim 23 wherein said restriction
endonuclease reagent comprises CviJ I.

25. The method according to claim 23 wherein said restriction
endonuclease reagent comprises CGase I.

26. The method according to claim 23 wherein said quasi-random 
fragments are from about 100 base pairs to about 10,000 base pairs in length. 

27. The method according to claim 23 wherein said quasi-random 
fragments are from about 500 bp to about 2,000 bp in length. 



28. The method according to claim 23 wherein said cloning vector 
is selected from the group consisting of plasmids, phage, and cosmids. 
29. The method according to claim 28 wherein said plasmid is
pUC19.

30. The method according to claim 28 wherein said bacteriophage
is λ.

31. The method according to claim 28 wherein said bacteriophage
is M13.

32. The method according to claim 23 wherein said host cell is a
bacteria.

33. The method according to claim 32 wherein said host cell is E.
coli.

34. The method according to claim 23 wherein said sequencing is
dideoxy sequencing.

35. A kit for the shotgun cloning of DNA, said kit comprising in 

association: 

a) a restriction endonuclease reagent, according to
claims 16 or 18;

b) a restriction endonuclease buffer;

c) ligation buffer; and 

d) T4 DNA ligase. 






36. The kit of claim 35 further comprising in association: 

e) competent host bacteria; 

f) chromatography matrix said matrix useful for the size 
selection of restriction endonuclease digested DNA; 

g) spin filters, said spin filters useful for the size selection of 
restriction endonuclease digested DNA; 

h) a cloning vector; 

i) positive control DNA useful in the monitoring of the 
efficiency of the said shotgun cloning; and 

j) molecular size marker DNA. 

37. The kit according to claim 35 wherein said restriction 
endonuclease reagent comprises CviJ I. 

38. The kit according to claim 37 wherein said restriction
endonuclease buffer is CviJ I** buffer.

39. The kit according to claim 35 wherein said restriction 
endonuclease reagent comprises CGase I. 



40. The kit according to claim 39 wherein said restriction 
endonuclease buffer is CGase I buffer.

41. The kit according to claim 36 wherein said competent host
bacteria is competent E. coli DH5αF'.

42. The kit according to claim 36 wherein said chromatography
matrix is Sephacryl S-500.

43. The kit according to claim 36 wherein said cloning vector is
M13 mp18.

44. A method for labeling DNA, the method comprising the steps 

of: 

a) digesting an aliquot of template DNA with a restriction 
endonuclease reagent according to claim 21 and wherein said 
digestion generates sequence-specific DNA fragments; 

b) mixing an aliquot of undigested template DNA with said 
sequence-specific DNA fragments, denaturing said mixture of 
template DNA and sequence-specific DNA fragments thereby
generating denatured template DNA and oligonucleotide primers;

c) annealing said primers to said denatured undigested template
DNA to form a DNA-primer complex;

d) performing an extension reaction from said primers in said
DNA-primer complex using a DNA polymerase in the presence of
one or more nucleotide triphosphates and wherein at least one
nucleotide triphosphate has a label. 

45. The method according to claim 44 wherein said restriction 
endonuclease reagent comprises CviJ I. 

46. The method according to claim 44 wherein said restriction 
endonuclease reagent comprises CGase I. 



47. The method according to claim 44 wherein said extension 
reaction is performed by a DNA polymerase. 
48. The method according to claim 47 wherein said DNA
polymerase is Thermus flavus DNA polymerase.

49. The method according to claim 44 wherein the one or more 
nucleotide triphosphates are selected from the group consisting of dATP, dCTP, 
dGTP, dUTP and dTTP. 

50. The method according to claim 44 wherein said labeled
nucleotide triphosphate is selected from the group consisting of ³²P-labeled
nucleotide triphosphates and ³³P-labeled nucleotide triphosphates.

51. The method according to claim 44 wherein said labeled
nucleotide triphosphate is selected from the group consisting of biotin-labeled
nucleotide triphosphates, fluorescein-labeled nucleotide triphosphates,
dinitrophenol-labeled nucleotide triphosphates, and digoxigenin-labeled nucleotide
triphosphates.
52. A method for thermal cycle labeling DNA comprising the
steps of:

a) digesting an aliquot of template DNA with a restriction
endonuclease reagent according to claim 21 and wherein said 
digestion generates sequence-specific DNA fragments; 

b) mixing an aliquot of undigested template DNA with said 
sequence-specific DNA fragments, denaturing said mixture of 
template DNA and said DNA fragments thereby generating 
denatured template DNA and oligonucleotide primers; 

c) annealing said primers to said denatured undigested template 
DNA to form a DNA-primer complex; 

d) performing an extension reaction from said primers in said 
DNA-primer complex using a DNA polymerase in the presence of 
one or more nucleotide triphosphates and wherein at least one 
nucleotide triphosphate has a label. 

e) heat-denaturing said labeled extension products; 

f) reannealing said excess primers with said template DNA 
and with said extension products; 

g) performing at least one additional extension reaction from 
said DNA-primer complex using a DNA polymerase. 

53. The method according to claim 52 wherein said restriction 
endonuclease reagent comprises CviJ I. 

54. The method according to claim 52 wherein said restriction 
endonuclease comprises CGase I. 

55. The method according to claim 52 wherein said DNA 
polymerase is a heat stable DNA polymerase. 
56. The method according to claim 55 wherein said heat-stable
DNA polymerase is Thermus flavus DNA polymerase or a functional fragment
thereof.

57. The method according to claim 52 wherein said extension 
products also serve as templates. 

58. The method according to claim 52 wherein said label is 
selected from the group consisting of fluorescein, dinitrophenol, biotin, and
digoxigenin.

59. The method according to claim 52 wherein said label is
selected from the group consisting of ³²P, ³³P, ³H, ¹⁴C, and ³⁵S.

60. The method according to claim 52 wherein steps e)-g) are 
repeated up to 20 times. 



61. A kit for labeling DNA, said kit comprising in association: 

a) a restriction endonuclease reagent, according to 
claims 16 or 18; 

b) a restriction endonuclease buffer; and 

c) a labeling buffer. 



62. The kit according to claim 61 wherein said restriction 
endonuclease reagent comprises CviJ I. 



63. The kit according to claim 62 wherein said restriction 
endonuclease buffer is CviJ I** restriction endonuclease buffer. 
64. The kit according to claim 61 wherein said restriction
endonuclease reagent is selected from the group consisting of CGase I and Aci I.

65. The kit according to claim 64 wherein said restriction
endonuclease buffer is CGase I buffer.

66. The kit of claim 64 further comprising:

d) a concentrated mixture of 1 or more nucleotide
triphosphates; 

e) a DNA polymerase; 

f) control DNA, said control DNA being useful for monitoring 
the efficiency of labeling. 

67. The kit according to claim 66 wherein said nucleotide mixture 
is an equimolar mixture of one or more nucleotides selected from the group 
consisting of dCTP, dTTP, dATP, and dGTP.

68. The kit according to claim 66 additionally comprising a labeled
nucleotide selected from the group consisting of biotin-11-dUTP,
digoxigenin-11-dUTP and fluorescein-11-dUTP.

69. The kit according to claim 66 additionally comprising a labeled
nucleotide selected from the group consisting of ³²P-labeled nucleotides,
³³P-labeled nucleotides, ¹⁴C-labeled nucleotides, ³⁵S-labeled nucleotides, and
³H-labeled nucleotides.

70. The kit according to claim 66 wherein said DNA polymerase
is the Klenow fragment of DNA polymerase I.

71. The kit according to claim 66 wherein said DNA polymerase
is a thermostable DNA polymerase.



72. The kit according to claim 66 wherein said thermostable DNA 
polymerase is Thermus flavus DNA polymerase. 



73. A method for universal thermal cycle labelling DNA
comprising the steps of:

a) mixing an aliquot of template DNA with a holo-enzyme
of a thermostable DNA polymerase, whereby the
polymerase provides endogenously purified DNA primers;

b) denaturing said mixture of template DNA and said
endogenous DNA primers;

c) annealing said mixture of denatured template DNA
and said endogenous DNA primers to form a DNA-primer
complex;

d) performing an extension reaction from said
endogenous DNA primers in said DNA-primer complex
using said DNA polymerase in the presence of one or more 
nucleotide triphosphates and wherein at least one nucleotide 
triphosphate has a label; 

e) heat-denaturing said labeled extension products; 

f) reannealing said endogenous primers with said 
template DNA and with said extension products; 

g) performing at least one additional extension reaction 
from said DNA-primer complex using a DNA polymerase. 
74. The method according to claim 73 wherein said heat-stable
DNA polymerase is Thermus flavus DNA polymerase or a functional fragment
thereof.

75. The method according to claim 73 wherein said extension 
products also serve as templates. 

76. The method according to claim 73 wherein said label is 
selected from the group consisting of fluorescein, dinitrophenol, biotin, and
digoxigenin.

77. The method according to claim 73 wherein said label is
selected from the group consisting of ³²P, ³³P, ¹⁴C, and ³⁵S.



78. The method according to claim 73 wherein steps e)-g) are
repeated up to 20 times.

79. A kit for labeling DNA, said kit comprising in association:

a) a holo-enzyme of a thermostable DNA polymerase;
and

b) a DNA polymerase buffer.

80. The kit of claim 79 further comprising:

c) a concentrated mixture of 1 or more nucleotide
triphosphates;

d) control DNA, said control DNA being useful for monitoring
the efficiency of labeling.

81. The kit according to claim 80 wherein said nucleotide mixture
is an equimolar mixture of one or more nucleotides selected from the group
consisting of dCTP, dTTP, dATP, and dGTP. 

82. The kit according to claim 80 additionally comprising a labeled 
nucleotide selected from the group consisting of biotin-11-dUTP,
digoxigenin-11-dUTP and fluorescein-11-dUTP.

83. The kit according to claim 80 additionally comprising a labeled
nucleotide selected from the group consisting of ³²P-labeled nucleotides,
³³P-labeled nucleotides, ¹⁴C-labeled nucleotides, ³⁵S-labeled nucleotides, and
³H-labeled nucleotides.

84. The kit according to claim 80 wherein said thermostable DNA
polymerase is Thermus aquaticus DNA polymerase.

85. The kit according to claim 80 wherein said thermostable DNA 
polymerase is Thermus flavus DNA polymerase. 

86. A method for labeling of restriction-generated oligonucleotides,
the method comprising the steps of:

a) digesting an aliquot of template DNA according to 
claim 21; 

b) heat denaturing said digested DNA thereby generating 
sequence-specific oligonucleotides; and 

c) labeling said sequence-specific oligonucleotides with 
a label capable of detection. 
87. The method according to claim 86 wherein said restriction-generated
oligonucleotides are labeled on the 5' end.

88. The method according to claim 86 wherein said restriction-generated
oligonucleotides are labeled on the 3' end.

89. The method according to claim 86 wherein the label is
radioactive.

90. The method according to claim 86 wherein the label is
non-radioactive.
91. A method for anonymous primer cloning, the method 
comprising the steps of: 

a) digesting an aliquot of template DNA according to claim 21 
thereby generating anonymous DNA fragments; 

b) digesting a plasmid cloning vector with a restriction 
endonuclease thereby creating a cloning site for insertion of said 
anonymous DNA fragments; 

c) ligating the anonymous DNA fragments of step a) into the 
cloning site of step b) thereby creating recombinant plasmids; 

d) transforming competent bacteria with the recombinant 
plasmids; 

e) selecting transformed colonies; 

f) purifying the recombinant plasmids from said transformed 
bacteria; 

g) digesting the recombinant plasmid with a restriction 
endonuclease said restriction endonuclease being capable of cutting 
said recombinant plasmid at a site, said site lying within the cloned 
anonymous DNA fragment; 

h) annealing one or more extension primers to the digested 
recombinant plasmid, said extension primers being complementary 
to plasmid sequences flanking the anonymous primer; 

i) extending the extension primer in a template-dependent 
fashion in the presence of one or more nucleotide triphosphates and 
a DNA polymerase; and 

j) denaturing the said hybridized extended primer. 

92. The method according to claim 91 wherein said restriction 
endonuclease reagent comprises CviJ I.

93. The method according to claim 91 wherein said restriction
endonuclease reagent comprises CGase I.

94. The method according to claim 91 wherein said plasmid 
cloning vector is pFEM. 

95. The method according to claim 94 wherein the restriction 
endonuclease of step b) is Eco RV. 



96. The method according to claim 91 wherein said extension 
primer has a label capable of detection. 

97. A kit for anonymous primer cloning comprising in association: 

a) a restriction endonuclease reagent, according to claims 16 or
18;

b) a restriction endonuclease buffer; 

c) a cloning vector; 

d) competent bacteria; 

e) one or more extension primers, said extension primers being
complementary to plasmid sequences flanking said anonymous
primers; and

f) a DNA polymerase reagent. 

98. The kit according to claim 97 wherein said restriction
endonuclease reagent comprises CviJ I.



99. The kit according to claim 98 wherein said restriction 
endonuclease buffer is CviJ I*^ buffer. 

100. The kit according to claim 97 wherein said restriction
endonuclease reagent is selected from the group consisting of CGase I and Aci I.

101. The kit according to claim 100 wherein said restriction
endonuclease buffer is CGase I buffer.

102. The kit according to claim 97 wherein said cloning vector is
pFEM.
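
The digestion recited in step a) of claim 91 can be pictured with a small in silico sketch. It is an illustration only and rests on assumptions that are not recited in the claims: it takes the recognition sequence of CviJ I to be PuGCPy (RGCY) with a blunt cut between G and C, as reported in the literature, and it models partial (quasi-random) digestion as an independent cut probability at each site. The names `cut_positions`, `digest` and `cut_probability` are hypothetical.

```python
import random
import re

# Assumption (not stated in the claims): CviJ I recognizes PuGCPy (RGCY)
# and cuts between the G and the C, leaving blunt ends.
# The lookahead lets overlapping sites be found as well.
SITE = re.compile(r"(?=[AG]GC[CT])")


def cut_positions(seq):
    """Return the indices at which the sequence would be cut (after the G)."""
    return [m.start() + 2 for m in SITE.finditer(seq.upper())]


def digest(seq, cut_probability=1.0, rng=None):
    """Split seq at assumed CviJ I sites.

    cut_probability=1.0 models a complete digest; smaller values model a
    partial digest in which each site is cut independently at random.
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    cuts = [p for p in cut_positions(seq) if rng.random() < cut_probability]
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    template = "TTAGCTAAGGCCTT"
    print(cut_positions(template))   # [4, 10]
    print(digest(template))          # ['TTAG', 'CTAAGG', 'CCTT']
```

Whatever the cut probability, joining the fragments in order gives back the template, which is a convenient sanity check before modelling the ligation and primer extension steps.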
lacZ'

TA ACAATTTCACACAGGAAACAGCT ATG ACC ATG ATT ACG CCA AGC TCG AAA TTA
                           M   T   M   I   T   P   S   S   K   L

                                                      XhoI
ACC CTC ACT AAA GGG AAC AAA AGC TGG TAC CGG GGC CCC CCC TCG AGG TCG
T   L   T   K   G   N   K   S   W   Y   R   G   P   P   S   R   S

ACG GTA TCG ATA AGC TTG ATA AAC CAT TTA TAC AAT AAG CGT TGA TATAAGTTT
T   V   S   I   S   L   I   N   H   L   Y   N   K   R   *

GTATATACGTCATTTCGTTATATCAACAA ATG TTA TCA TAT TAT ACG TAA AACTGGCT
                              M   L   S   Y   Y   T   *

M.CviJI

TAAAAAAAAACGAGGTGTAACTATA ATG TCT TTT CGC ACG TTA GAA CTA TTT ...
                          M   S   F   R   T   L   E   L   F

AmpR
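
The amino acid letters printed beneath the codons in the figure follow the standard genetic code, so each reading frame can be checked mechanically. The following is a minimal sketch under that assumption, using two reading frames shown in the figure as inputs; the function name `translate` and the use of "X" for unreadable codons are hypothetical choices for illustration.

```python
from itertools import product

# Standard genetic code, laid out in T, C, A, G order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string codon by codon, stopping after the first stop (*).

    Whitespace is ignored, so codons may be typed in spaced triplets.
    Codons with characters other than A, C, G, T are reported as 'X'.
    """
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATG ACC ATG ATT ACG CCA AGC TCG AAA TTA"))  # MTMITPSSKL
    print(translate("ATG TTA TCA TAT TAT ACG TAA"))              # MLSYYT*
```

Reporting unreadable codons as "X" rather than raising an error keeps the rest of a partly damaged line comparable against the printed translation.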



WO 94/21663



2/11 PCT/US94/03246 



Figure 2 



| 10 | 20 | 30 | 40 | 50

[Figure 2, nucleotides 1-1120: sequence rows illegible in the source scan]

1121 AAATGGATAT AAGAAGAAAA CGTTTTACAA TACAAGCCGC TAAACGTATA ATACTCGAAA AAAAGAGACT TGAAGAGAAA
1201 AAAAGAATTG CGGAAGAGAA AAAAAGAATT CCACTTATAC AAAAACAACG AATTCCCGAA GAGAAAAAAA CAATTCCGGA
1281 AGAGAAAAAA CGATTCGCAC TTGAAGAGAA AAAACGAATT CCGGAAGAAA AAAAACGAAT CGCCCAAGAG AAAAAACCAA
1361 TCGTCGAAGA GAAAAAAAGA CTTGCACTTA TAGAAAAACA ACCAATTGCG CAAGAGAAAA TTGCGTCGGG GAGAAAAATT
1441 AGAAAGAGGA TCTCTACAAA TGCAACAAAA CATGAAAGAC AATTTCTCAA AGTTATAAAT TCAATCTTCG TCGCACCCGC
1521 TACTTTTGTA TTCGTAGATA TAAAAGGTAA TAAATCCACA GAAATCCACA ACGTTGTAAG ATTCAGACAA TTACAAGGCA
1601 GTAAAGCGAA ATCCCCGACC CCCTATGTTC ATAGACAATA TAACAAACCT AAACCGGATA TAGCAGCGGT AGACATAACC
1681 GGTAAAGATG TGGCATGGAT ATCCCATAAA GCATCTGAAG CATATCAACA ATATCTAAAA ATTTCTGGAA AGAACCTCAA
1761 GTTCACAGGA AAAGAATTAG AAGAAGTTCT ATCGTTCAAG AGAAAAGTAG TTAGTATCGC ACCGGTATCT AAAATATGGC
1841 CTGCTAATAA GACCGTATGG TCTCCTATCA AGTCAAATTT GATTAAAAAT CAACCAATAT TCGGATTTGA TTACGGTAAG
1921 AAACCAGCAA GGGACAATGT AGACATCATA GCTCAAGGAC GACCAATTAT AACAAAAAGA CGTTCCATAT TATATCTTAC
2001 ATTCACTGGT TTTACCGCAT TAAATGGGCA CTTGGAGAAT TTTACTGGGA AACATCAACC CGTTTTCTAT GTAAGAACAG
2081 AACGGAGTAG TAGCGGGAGA AGTATAACAA CTCTCCTCAA TGCTCTCACT TATAAAAATT TAAGATTCTT TATACATCCA
2161 TACAACTTTG TTTCTTCAAA AACACAACGT ATTATGTAGG ACCATTTTCC CGAGAGACTT TGTTGACCGC GTACTAAAAA
2241 ATGGTCACGA TATTTGTCTA AAGATGCTCA TAGAAGCAGG TGCAAACCTT GACATCGTCA GTGTTGAGTA TACACCATTA
2321 CATCTACATG TGGTGATATT TGTATAAACG GTAAATACCT ATATATACAA TACGTATCCC CCTAAAAGCC CTTAGATTTT
2401 TTAGTTGTAT ACTACTTTTG TATAAGACCT GTAAGTTACA AACTAAAAGT TTCAGCTTTG CCTTCGAAAC AAGCAATTAC
2481 CGCATGAGAA TAATATCCAT TATGGATGTT TTCTGCTAAT AAAACCATAT TTCCTACAGA AGTTTCTATG ATTAGTTCCG
2561 AAATATTGAG ATCATCGTCA CGTTTTTCTT TACCGTATTT TACTTTCGTG ATCGTCGCAC CAATAAAATC ATCTCGTCTC
2641 AGTTCATTCG GCAATTGTCC CGTGACACCA AATCTCTCAC AACAACCTTG ATGTCCATCC ATTCCTAACA CTATCGGTAA
2721 TCCATGTGTG GTGTGTACGA CCACACCGTT ATAACTATAA CACGTGTAGT TGTCGTCTAT ATCATATAAC TCGAGAGCGG
2801 TGTGAACTTC TTCAGATCTA TTATTAATCG GATCTGATCC ATAAGAACAA TCTTCATATT TACAAATAAA ATCATCCGAT
2881 ATGTTCTGCA CACGAACAAC ATTCGTCAAA TTTCTCTGAT CACGAATCTC CATCTCTGAA TCATTACACA CTTGCGAGTA
2961 TATAACATTA TAATTGTTGA TATCATTATT ACGTTTCATA TCAACAAAAT ACATATAAAC ACCATACAAA TATTAAAACA
3041 CGTTACTATA TAATGGATAA CATTTGCAAT AGTATATTCA CTGCAGTAAA AAATGGCCAC GAAGCTTGTT TGAAGATCAT
3121 GCTCATTGAA AGAGGTAGCA ATATCAATGA TGTTTCCCAA TCAAAATATC GAAATACACC ACTACATATT GCAGCTCATC
3201 ATCGTAATGA TGTGTGTTTG AACATGCTTA TTGACGCAGG TGCAAACCTT GATATCACAG ATATTTCTGG AGGAACACCA
3281 CTTCATCGTG CGGTTTTGAA TGGCCATGAC ATATCTGTAC AGATGCTCGT AGAACCAGGT GCAAACCTTA GTATCATAAC
3361 TAATTTGGGA TGGATACCGT TACATTACGC GGCTTTTAAT GGTAATGATG CGATTTTCAG GATGCTCATC GTTGTAAGTG
3441 ATAATGTTGA CGTTATCAAT GATCGCGGTT GGACGGCGTT ACATTACGCG GCTTTTAATG GTCATAGCAT GTGCGTCAAC
3521 ACGCTTATTG ATGCGGGTGC AAATCTTGAC ATCACAGATA TTTCGCGATC TACACCACTT CATCGTGCGG TTTATAATGA
3601 CCACGATGCA TCTGTGAAGA TACTCGTAGA AGCAGGTGCA ACTCTTGACC TCATTGATGA TACTGAGTGG GTGCCGTTAC
3681 ATTACGCGGC TTTTAATGGT AATGATGCGA TTTTGAGGAT CCTCATTGAA GCAGCTCCAG ATATTGATAT ATCTAATATA
3761 TGTGATTGGA CGGCGTTACA TTACGCGGCT CGAAATGGAC ACGATGTGTG TATAAAAACA CTCATCGAAG CACGTGGTAA
3841 CATCAACGCC GTCAACAAAT CGGGGGATAC ACCACTAGAT ATTGCAGCAT GTCATGACAT TGCAGTATGT GTGATCGTGA
3921 TAGTCAATAA GATCCTTTCG GAGCGGCCGT TGCGTCCGAG TCAGTTCTCT GTCATACCAC CAACGTCTCC TGCATTACGT
4001 CATGTGTTGC GAACGACGAT GCCGCTTCAT CGGCGATCGG AAGCTCCAAA CATCACAGCG CATCTTCCTC TCGCTGCAAG
4081 GGATACTCTA CGAACTACTG CGTTGTGTTT GAACCGAACA ATTTCCGAGA GATCTCGTTG ATACIGTATT AATTCAATCC
4161 CTGTAAAGTT ACGCTATTTT TTTCCAAAAA GCGTTTGCAT GAAATACAAC ACGATCTTTT CTACATCGTT TACCATTAGT
4241 TGTATTCGTG CAATAGAGAC CATACGTACC [illegible] TTTACTTTAC CTACAGTATT ACCACTTCCT TTTTTTCCTA
4321 TAGTAGTATC TAAATTCAAC CCTTTGAACT CATCGCCATT AACAGACAGA GCGTATGAAC CGTTTTGTGC CAATTTCACC
4401 TTCAAAACGA TAGTAACCCA TTGACCTCTA CGAATTTTAA CCGATCTTAT AAGTATCTGC TTACTTCCAA CTCCTTTTTC
4481 AAAAGCATAC AACGATCCTG TAAGGTTATC CCCAGAACCT GAAATTCTAA AGAACGACTG GAAATGAATA GGTTGCATTA
4561 GATCTGTATA CATATCACTT GGTTCGAAAT GAAAATCCTA GTCCCAATTA GGTACGTTCC ACCAAGTTTA ATACGGGGTC
4641 TTTCCACCGA GACCGGACAT TTCAGCACGA GCCTTGTAAG AATGATATGA TGTGGTTAAA TCTCTATCAC CATCGTTCCA
4721 CTTTCCTCTG AACCGAAGAC CATGCATCCT TATACCTGGT CCAACCTGTA CTAAATTCTT TATTTCAGGT CCGGCTCCGC
4801 GTGGATTAAC TCGAGATTCG TCAAATCTAA AATATGATAA CCATGTTCCA ACAGTAGAAC CACTGGCTGG TATGGCAGTT
4881 GCTGGAAGGG AAGGTAAAAC TTTACGATAT TTCAAATCAC CAACACCTTG AGGGTTTACT TGAATACTTC TGGGAGATGT
4961 TGGTGGTTTC GTCGAAGGTG GTTTCGTTGA AGGTGGTTTC GTCGAAGGTG GTTTCGTCGA AGGTGGTTTC GTCCAAGGTG
5041 GTTTCGTCGA AGGTGGTTTC GTCGAAGGTG GTTTCGTCCA AGGTGGTTTC CTCGAAGCTG GTTTCGTCGA AGGTGGTTTC
5121 GTCGAAGGTG GTTTCGTCCA AGGTGGTTTC GTCGAAGGTG GTTTCGTCGA AGGTGGTTTC GTCGAAGGTG GTTTCGTTGG
5201 CGGAAGTGGG GCATGACCAT AATCCGTTAA ATTCCCGCAT TCACCTAATQ ATGTACTCCA TAAACAACCG GGTGCCCATT
5281 CCATTCTTAT TGGTTCTGTA GTATCAGATA TACATACGAA A'aaTGACAA TCATTTTCCC TCCCAAATAA TTTACCAGAT
5361 TTCCCTTTAC ATGACATTAT TTGTAATATA [illegible] *A"~'AAAA AAACTAACGT CTATTTAAAA TTATGTAATA
5441 CCTATTATAT CAATGCATCA TCTTAATCAT TTCCTAACC- a-aaCCCTAC CCAATTC

1 CAA GAA TAT CTT GCT TAT TTG GTT CAA TAT CAT TAA AATATTTTGATACACTAA ATG GAT ATA
QCYLGYLVOYD* ffldi 

68 AGA AGA AAA CGT TTT ACA ATA GAA GGG GCT AAA CGT ATA ATA CTC GAA AAA AAG AGA CTT
r .rlcr f t i egokr i i I ekk r I 

128 CAA GAG AAA AAA AGA ATT GCC GAA GAG AAA AAA AGA ATT CCA CTT ATA GAA AAA CAA CCA 
• ekkrioeekkriat i«kqr 

188 ATT GCG GAA GAG AAA AAA AGA ATT GCG GAA GAG AAA AAA CGA TTC CCA CTT GAA GAG AAA
'oeekkri o«ekkrfal •ek 

R.CviJI

248 AAA CGA ATT GCC GAA GAA AAA AAA CGA ATC GCG GAA GAG AAA AAA CGA ATC GTG CAA GAG 

kria©»k kria©»kkri v r r 3 

308 AAA AAA AGA CTT CCA CTT ATA GAA AAA CAA CGA ATT GCG GAA GAG AAA ATT GCG TCG GGG 

JC K a I k L I E K Q g T AEEKIASG 23 

368 AGA AAA ATT AGA AAG AGG ATC TCT ACA AAT GCA ACA AAA CAT GAA AGA GAA TTT CTC AAA

R K I R K R I |S^ T NATKHEREFVK « 

428 GTT ATA AAT TCA ATG TTC CTC CGA CCC GCT ACT TTT GTA TTC GTA GAT ATA AAA CGT AAT

ViNSnrVGPATFVFVOlKGN 63 

488 AAA TCC AGA GAA ATC CAC AAC CTT GTA AGA TTC AGA CAA TTA CAA GGC ACT AAA GCG AAA 

KSRElHNVVRrRQLQCSKAK 83 

548 TCC CCC ACC GCC TAT GTT GAT AGA GAA TAT AAC AAA CCT AAA GCC GAT ATA GCA GCG GTA 

SPTAYVDREYNKPICAOIAAV 103 

608 CAC ATA ACC GCT AAA CAT GTG CCA TCG ATA TCC CAT AAA GCA TCT GAA GGA TAT CAA CAA

OITGKDVAWISHICAS- ECYOO 123 

668 TAT CTA AAA ATT TCT CGA AAG AAC CTC AAC TTC ACA GGA AAA GAA TTA GAA GAA GTT CTA

YLKISGICNLKFTCXELEEVL 143 

728 TCG TTC AAG AGA AAA GTA GTT ACT ATC GCA CCG GTA TCT AAA ATA TCG CCT GCT AAT AAG 

SFKRKVVSnAPV SKIWPANK 163 

788 ACC GTA TCG TCT CCT ATC AAG TCA AAT TTC ATT AAA AAT CAA CCA ATA TTC CCA TTT CAT 

TVWSPIKSNLIKNOAIFCFO 183 

848 TAC CGT AAG AAA CCA CGA AGG GAC AAT GTA GAC ATC ATA CCT CAA GCA CGA CCA ATT ATA 

YGKKPGRONVOIIGQCRPII 203 

908 ACA AAA ACA CCT TCC ATA TTA TAT CTT ACA TTC ACT CCT TTT ACC CCA TTA AAT GGG CAC 

TKRCSILYLTFTGFSALNCH 223 

968 TTC GAG AAT TTT ACT GGG AAA CAT GAA CCC CTT TTC TAT CTA AGA ACA GAA CCC AGT ACT 

tENFTCKHEPVFYVRTERSS 243 

1028 AGC GGG AGA ACT ATA ACA ACT CTC CTC AAT GCT CTC ACT TAT AAA AAT TTA AGA TTC TTT 

SGRSITTVVNGVTYICNLRFF 263 

1088 ATA CAT CCA TAC AAC TTT GTT TCT TCA AAA ACA CAA CCT ATT ATG TAC CACCATTTTCCCCAG 

iHPYNFVSSKT ORrM- 273 

1152 AGACTTTGTTCACCCCCTACTAAAAAATGCTCACCATATTT5TCTAAAGATCCTCATACAACCACCTCCAAACCTTCAC 

BOX II. OBSERVATIONS WHERE UNITY OF INVENTION WAS LACKING
This ISA found multiple inventions as follows:

I. Claims 1-11, drawn to a polynucleotide encoding CviJI, the vector carrying said polynucleotide, the transformed or
transfected host carrying said vector, and method for producing a CviJI polypeptide from said host, classified in Class
536, subclass 23.72, for example.

II. Claims 12 and 13, drawn to the recombinant CviJI polypeptide, classified in Class 530, subclass 350, for example.

III. Claims 14, 16, 19, 21, and 22, drawn to a method for restriction endonuclease digestion using CviJI, classified in
Class 435, subclass 6, for example.

IV. Claims 14, 15, 17, 18, and 20-22, drawn to CGase I restriction endonuclease and a method for using it in
restriction endonuclease digestion, classified in Class 435, subclass 6, for example.

V. Claims 23, 24, 26-38, and 41-43, drawn to a method for shotgun cloning after partial digestion using CviJI,
classified in Class 435, subclass 172.3.

VI. Claims 23, 25-35, and 39-43, drawn to a method for shotgun cloning after partial digestion using CGase I,
classified in Class 435, subclass 172.3.

VII. Claims 44, 45, 47-53, and 55-63, drawn to a method of extension labeling of DNA and thermal cycle labeling
using CviJI, classified in Class 435, subclass 91.1, for example.

VIII. Claims 44, 46-52, 54-61, and 64-72, drawn to a method of extension labeling of DNA and thermal cycle labeling
using CGase I, classified in Class 435, subclass 91.1, for example.

IX. Claims 73-85, drawn to a universal thermal cycle labeling of DNA, classified in Class 435, subclass 91.1, for
example.

X. Claims 86-90, drawn to a method of end labeling after CviJI digestion, classified in Class 435, subclass 91.53.

XI. Claims 86-90, drawn to a method of end labeling after CGase I digestion, classified in Class 435, subclass 91.53.

XII. Claims 91, 92, and 94-99, drawn to a method for anonymous primer cloning after digestion with CviJI, classified
in Class 435, subclass 172.3, for example.

XIII. Claims 91, 93, 94-97, and 100-102, drawn to a method for anonymous primer cloning after digestion with CGase
I, classified in Class 435, subclass 172.3, for example.

Detailed Reasons for Lack of Unity

PCT Rule 13 recites the basic principle of unity of invention that an application should relate to only one
invention or, if there is more than one invention, that applicant would have a right to include in a single application
only those inventions which are so linked as to form a single general inventive concept. According to Rule 13, a group
of inventions is linked to form a single inventive concept where there is a technical relationship among the inventions
that involves at least one common or corresponding special technical feature that defines the contribution which each
claimed invention, considered as a whole, makes over the prior art.

The thirteen inventions of this application consist of:

1) a polynucleotide encoding CviJI, the vector comprising it, the transformed host carrying the vector, and a method of
making the protein using the vector,

2) the recombinant polypeptide CviJI,

3) a method for restriction endonuclease digestion using CviJI,

4) CGase I restriction endonuclease and a method for using it in restriction endonuclease digestion,

5) a method for shotgun cloning after partial digestion using CviJI,

6) a method for shotgun cloning after partial digestion using CGase I,

7) a method of extension labeling of DNA and thermal cycle labeling using CviJI,

8) a method of extension labeling of DNA and thermal cycle labeling using CGase I,

9) a universal thermal cycle labeling of DNA,

10) a method of end labeling after CviJI digestion,

11) a method of end labeling after CGase I digestion,

12) a method for anonymous primer cloning after digestion with CviJI, and

13) a method for anonymous primer cloning after digestion with CGase I.

The thirteen inventions are not linked by a special technical feature within the meaning of PCT Rule 13 for
the following reasons: Those claims drawn to CviJI are not linked to those claims drawn to CGase I because there 
is no technical relationship among these inventions that involves at least one common or corresponding special technical 
feature. 

The claims that involve the polynucleotide encoding CviJI, the vector containing it, the host carrying the
vector, and methods of making recombinant protein are not linked to the recombinant protein because the protein and
polynucleotide share a technical relationship that involves a corresponding technical feature that does not define the
contribution which each claimed invention, considered as a whole, makes over the prior art because cloning and
expression of polynucleotides to make recombinant polypeptides are well known in the art. Accordingly, such does not
constitute a special technical feature within the meaning of PCT Rule 13.2.

The methods for restriction endonuclease digestion, shotgun cloning and sequencing with CviJI, for extension
and thermal cycle labeling with CviJI, for universal cycle labelling, for end labeling after CviJI digestion, and for
anonymous primer cloning after CviJI digestion involve a corresponding technical feature, digestion with CviJI, that
does not define the contribution which each claimed invention, considered as a whole, makes over the prior art because
restriction endonuclease digestion, and shotgun cloning and sequencing, extension and thermal cycle labeling after rest,
universal cycle labelling, end labeling, and anonymous primer cloning after restriction endonuclease digestion are well
known in the art. In addition, CviJI is also known in the art. Accordingly, such does not constitute a special technical
feature within the meaning of PCT Rule 13.2.

Similarly, the methods for restriction endonuclease digestion, shotgun cloning and sequencing with CGase I,
for extension and thermal cycle labeling with CGase I, for universal cycle labelling, for end labeling after CGase I
digestion, and for anonymous primer cloning after CGase I digestion involve a corresponding technical feature, digestion
with CGase I, that does not define the contribution which each claimed invention, considered as a whole, makes over the
prior art because restriction endonuclease digestion, and shotgun cloning and sequencing, extension and thermal cycle
labeling after rest, universal cycle labelling, end labeling, and anonymous primer cloning after restriction endonuclease
digestion are well known in the art. Accordingly, such does not constitute a special technical feature within the
meaning of PCT Rule 13.2.